A Look at Worldwide High-Performance Computing and Its Economic Implications for the U.S.*

A Brief Technical Overview of the Present-Day Landscape

The United States has historically been the dominant country in the world in terms of both supercomputer development and application. The U.S. has the lead in both vector and parallel processing, and Cray Research, Inc., continues to be the preeminent company in the high-performance system industry. Moreover, the wide spectrum of approaches employed by U.S. supercomputer developers has resulted in an extremely fertile research domain from which a number of commercially successful companies have emerged—CONVEX Computer Corporation, Thinking Machines Corporation, and nCUBE Corporation among them. The U.S. high-performance-system user base can claim a sophistication exceeding or roughly equal to that in any other country.

However, we are not alone. A number of countries have undertaken extensive research efforts in the high-performance computing arena, including the Soviet Union, Japan, and several in Western Europe. Others, such as Bulgaria, Israel, and China, have initiated research in this area, and many countries now employ supercomputers. In this section we examine some of the more substantial efforts worldwide.

The Soviet Union

The Soviets have a long history of high-performance computing. The USSR began research into computing shortly after World War II and produced functional digital computers in the early 1950s. The first efforts in parallel processing began in the early 1960s, and research in this area has continued steadily since then.

Soviet scientists have explored a wide spectrum of approaches in developing high-performance systems but with little depth in any one. Consequently, the Soviets have yet to make a discernible impact on the global corpus of supercomputing research. The Soviets to date have neither put into serial production a computer of CRAY-1 performance or greater (only within the last few years have they prototyped a machine at that level) nor entered the worldwide supercomputer market. However, Soviet high-performance computing efforts conducted within the Academy of Sciences have exhibited higher levels of innovation than have their efforts to develop mainframes, minicomputers, and microcomputers.[*]

The BESM-6, a machine that is capable of a million instructions per second (MIPS) and was in serial production from 1965 to 1984, has been, until recently, the workhorse of the Soviet scientific community. The concept of a recursive-architecture machine with a recursive internal language, recursive memory structure, recursive interconnects, etc., was reported by Glushkov et al. (1974). The ES-2704, which only recently entered limited production, is a machine embodying these architectural and data-flow features. Computation is represented as a computational node in a graph. The graph expands as nodes are decomposed and contracts as results are combined into final results.
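The expand-and-contract behavior described above can be illustrated with a small sketch. The following Python fragment is not the ES-2704's internal language or scheduling scheme, neither of which is detailed here; it is a hypothetical, sequential illustration in which a summation node is decomposed into child nodes until the pieces are small, after which partial results are combined on the way back up. The names evaluate, leaf_size, and trace are assumptions introduced only for this example.

```python
def evaluate(node, leaf_size, trace, depth=0):
    """Evaluate one graph node (here, a list of numbers to be summed).

    A large node is decomposed into two child nodes (the graph expands);
    once the children finish, their results are combined (the graph contracts).
    """
    trace.append(("expand", depth, len(node)))
    if len(node) <= leaf_size:
        # Small enough to compute directly, with no further decomposition.
        result = sum(node)
    else:
        mid = len(node) // 2
        left = evaluate(node[:mid], leaf_size, trace, depth + 1)
        right = evaluate(node[mid:], leaf_size, trace, depth + 1)
        result = left + right
    trace.append(("contract", depth, result))
    return result


if __name__ == "__main__":
    events = []
    total = evaluate(list(range(8)), leaf_size=2, trace=events)
    print("total:", total)
    for kind, depth, value in events:
        print("  " * depth + f"{kind} {value}")
```

On a real machine the child nodes could be handed to different processors; this sketch evaluates them one after another and records only the order in which nodes appear and disappear.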

The ES-2701, developed at the Institute of Cybernetics in Kiev, like the ES-2704, incorporates distributed-memory flexible interconnects but is based on a different computational paradigm, there called a macropipeline computation, in which pipelining occurs at the algorithm level. For some problems, computation progresses as a wave across the processor field as data and intermediate results are passed from one processor to the next.
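The wave-like progress of a macropipeline can be sketched as a simple simulation. The fragment below is an assumption-laden illustration, not a description of the ES-2701's actual design: it models a linear chain of processors, each applying one algorithm-level stage, and records which data item each processor holds at every time step. The function name run_macropipeline and the stage functions in the example are hypothetical.

```python
def run_macropipeline(stages, items):
    """Simulate a linear chain of processors, one algorithm-level stage each.

    Returns the finished results and a timeline; timeline[t][p] is the index
    of the item held by processor p after step t (None if it is idle).
    """
    n_proc = len(stages)
    held = [None] * n_proc  # (item index, value) at each processor's output
    outputs = []
    timeline = []
    for step in range(len(items) + n_proc - 1):
        # Walk from the last processor back to the first so that each one
        # reads what its left-hand neighbor produced on the previous step.
        for p in range(n_proc - 1, -1, -1):
            if p == 0:
                incoming = (step, items[step]) if step < len(items) else None
            else:
                incoming = held[p - 1]
            if incoming is None:
                held[p] = None
            else:
                index, value = incoming
                held[p] = (index, stages[p](value))
        if held[-1] is not None:
            outputs.append(held[-1][1])
        timeline.append([h[0] if h is not None else None for h in held])
    return outputs, timeline


if __name__ == "__main__":
    example_stages = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
    results, steps = run_macropipeline(example_stages, [1, 2, 3, 4])
    print(results)
    for row in steps:
        print(row)
```

Printing the timeline shows the items entering at the first processor and moving one position per step, which is the wavefront behavior the paragraph describes.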

The ES-2703 is promoted as a programmable-architecture machine. The architecture is based on a set of so-called macroprocessors connected by a crossbar switch that may be tuned by the programmer. The "macro" designation denotes microcode or hardware implementation of complex mathematical instructions.

The El'brus project is the most heavily funded in the Soviet Union. The El'brus-1 and -2 were strongly influenced by the Burroughs 700-series architecture, with its large-grain parallelism, multiple processors sharing banks of common memory, and stack-based architecture for the individual processors. A distinguishing feature of this first El'brus machine stemmed from the designers' decision to use, in lieu of an assembly language, an Algol-like, high-level procedural language with underlying hardware support. This compelled the El'brus design team to maintain software compatibility across the El'brus family at the level of a high-level language, which in turn enabled them to use very different architectures for some of their later models (e.g., the El'brus-3 and mini-El'brus, both very-long-instruction-word machines).

Most of the more successful machines, from the point of view of production, have been developed through close cooperation between the Academy of Sciences and industry organizations. One such machine, the PS-2000, was built by the Impul's Scientific Production Association, an organization in the eastern Ukraine. The PS-2000 could have up to 64 processors operating in a SIMD fashion, and its successor, the PS-2100, combines 10 groupings of the 64 processors, with the whole complex then being able to operate in a MIMD fashion. Although the PS-2000 is now out of production, 200 were produced in various configurations and are now actively used, primarily in seismic and other energy-related applications. Series production of the PS-2100 began in 1990.

The development of high-performance computing in the Soviet Union is hindered by a number of problems. For one, the supply of components, both from indigenous suppliers and from the West, is inconsistent. Moreover, the state of mass storage is very weak. The 317-megabyte disks, which not long ago represented the Soviet state of the art, continue to be quite rare. Further, perestroika-related changes have caused sharp reductions in funding of several novel architecture projects, and a number have been terminated.

Western Europe

In Western Europe, while there has been no prominent commercial attempt to build vector processors, much attention has been paid to developing distributed processing and massively parallel, primarily Transputer-based, processors. Efforts in this realm have resulted in predominantly T-800 Transputer-based machines claiming processing rates of 1.5 million floating-point operations per second (MFLOPS) per processor, with up to 1000 processors. RISC-based chips promise to play a sizable role in the future. To date, however, the Europeans have been low-volume producers, with few companies having shipped more than a handful of machines. Two exceptions are the U.K.'s Meiko and Germany's Parsytec.

Meiko and Parsytec have proved to be the two most commercially successful European supercomputer manufacturers, with over 300 and 600 customers worldwide, respectively. Meiko produces two scalable, massively parallel dynamic-architecture machines (the Engineer's Computing Surface and the Embedded Real-Time Computing Surface) with no inherent architectural limit on the number of processors. Among Meiko's clients are several branches of the U.S. military and the National Security Agency. Parsytec's two Transputer-based MIMD systems, the MultiCluster and SuperCluster, are available in configurations with maximums of 64 and 400 processors, respectively.

Lesser producers of high-performance computing systems include Parsys, Active Memory Technology (AMT), and ESPRIT, the European Strategic Program for Research and Development in Information Technology.[*] The U.K.-based Parsys is the producer of the SuperNode 1000, another Transputer-based parallel processor, with 16 to 1024 processors in hierarchical, reconfigurable arrays. AMT's massively parallel DAP/CP8 510C (1024 processors) and 610C (4096 processors) boast processing speeds of 5000 MIPS (140 MFLOPS) and 20,000 MIPS (560 MFLOPS), respectively. Spearheaded by the Germans, ESPRIT's SUPRENUM project has produced the four-GFLOPS, MIMD SUPRENUM-1 and is continuing development of the more powerful SUPRENUM-2.

The Europeans have proved themselves expert at using vector processors as workhorses. Vector processors are in use in Germany, France, and England. Though the Europeans have been extensive users of U.S.-made machines, Japanese machines have recently started to penetrate the European market.

Japan

Japan is maturing in its use and production of high-performance systems. The Japanese have elevated vector processing to a fine art, both in the case of hardware and software, and are producing world-class systems that rival those of Cray. Moreover, the installed base of supercomputers in Japan has climbed to over 150, the number of Japanese researchers working in the realm of computational science and engineering is growing, and the quality of their work is improving.

The first vector processors to emerge from Japan, such as the Fujitsu VP-200, generated considerable excitement. Initial benchmarks indicated that these early supercomputers, with the many vector pipelines characteristic of the Japanese machines, were very fast. The Fujitsu machine was followed by the Hitachi S-820 and then the Nippon Electric Company (NEC) SX-2, which was, at that time, the fastest single processor in the world.[*] These machines also boasted many vector pipes, as well as automatic interactive vectorizing tools of high quality.

Recent Japanese announcements indicate that the trend toward greater vectorization will continue. The NEC SX-3, for example, employs a processor that can produce 16 floating-point results every three-nanosecond clock cycle, a performance that amounts to more than five GFLOPS per processor.
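The per-processor figure above follows from simple arithmetic: results per cycle divided by cycle time. The short Python sketch below reproduces that calculation, taking the quoted values (16 results per cycle, a three-nanosecond cycle) as exact for the purpose of illustration. The helper name peak_flops is hypothetical, and the result is a theoretical peak, not a measured or sustained rate.

```python
def peak_flops(results_per_cycle, cycle_time_ns):
    """Theoretical peak rate in floating-point results per second."""
    cycle_time_s = cycle_time_ns * 1e-9  # nanoseconds to seconds
    return results_per_cycle / cycle_time_s


if __name__ == "__main__":
    # Values quoted in the text for one NEC SX-3 processor.
    rate = peak_flops(results_per_cycle=16, cycle_time_ns=3.0)
    print(f"{rate / 1e9:.2f} GFLOPS per processor")  # prints 5.33 GFLOPS per processor
```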

It merits mention, however, that while Japanese high-performance computers compete well in the "megaflop derby," their sustained performance on production workloads remains unknown. Huge memory bandwidth hides behind the caches of these Japanese machines, and the memories are a fairly long distance from the processors, which probably inhibits their short vector performance.

Parallel processing is not, however, being ignored in Japan. The Japanese now have a number of production parallel processors, to which they are devoting much attention. In at least two areas of parallel processing, the Japanese have made significant progress. Most, if not all, Japanese semiconductor manufacturers are using massively parallel circuit simulators, and the NEC fingerprint identification machine, used in police departments worldwide, represents one of the largest-selling massively parallel processors in the world.

The Japanese have recently begun showing signs of accommodating U.S. markets. First, Japanese manufacturers are exhibiting some willingness to accommodate the IEEE and Cray floating-point arithmetic formats, in addition to the IBM format their machines currently support. Second, some machines, notably the SX-3, now run UNIX. These and other signs indicate that the Japanese seek not only to accommodate the American market but to aggressively enter it.

The software products available on Japanese supercomputers and the monitoring tools available to scientific applications programmers from Japanese vendors appear to be as good as or better than those available from Cray Research. Consequently, applications software being developed in Japan may be better vectorized as a result of the better tools and vendor-supplied software. Further, Japanese supercomputer centers seem to be having little, if any, difficulty obtaining access to the best U.S.-developed applications software.

While the U.S. appears to be preeminent in all basic research areas of computational science and engineering, the Japanese are making significant strides as the current generation of researchers matures in its use of supercomputers and a younger generation is trained in computational science and engineering. The environment in which Japanese researchers work is also improving, with supercomputer time and better software tools being made increasingly available. Networking within the Japanese supercomputing community, however, remains underdeveloped.

The American, Soviet, European, and Japanese machines and their parameters are compared in Table 1.

