David B. Nelson
David B. Nelson is Executive Director of the Office of Energy Research, U.S. Department of Energy (DOE), and concurrently, Director of Scientific Computing. He is also Chairman of the Working Group on High Performance Computing and Communications, an organization of the Federal Coordinating Council for Science, Engineering, and Technology. His undergraduate studies were completed at Harvard University, where he majored in engineering sciences, and his graduate work was completed at the Courant Institute of Mathematical Sciences at New York University, where he received an M.S. and Ph.D. in mathematics.
Before joining DOE, Dr. Nelson was a research scientist at Oak Ridge National Laboratory, where he worked mainly in theoretical plasma physics as applied to fusion energy and in defense research. He headed the Magneto-Hydrodynamics Theory Group in the Fusion Energy Division.
I believe all the discussion at the conference can be organized around the vision of a seamless, heterogeneous, distributed, high-performance computing environment that has emerged during the week and that K. Speierman alluded to in his remarks (see Session 1). The elements in this environment include, first of all, the people—skilled, imaginative users, well trained in a broad spectrum of applications areas. The second ingredient of that environment is low-cost, high-performance personal workstations and visualization engines. The third element is mass storage and accessible, large knowledge bases. Fourth is heterogeneous high-performance compute engines. Fifth is very fast local, wide-area, and national networks tying all of these elements together. Finally, there is an extensive, friendly, productive, interoperable software environment.
As far as today is concerned, this is clearly a vision. But all of the pieces are present to some extent. In this summary I shall work through each of these elements and summarize those aspects that were raised in the conference, both the pluses and the minuses.
Now, we can't lose sight of what this environment is for. What are we trying to do? The benefits of this environment will be increased economic productivity, improved standard of living, and improved quality of life. This computational environment is an enabling tool that will let us do things that we cannot now do, imagine things that we have not imagined, and create things that have never before existed. This environment will also enable greater national and global security, including better understanding of man's effect on the global environment.
Finally, we should not ignore the intellectual and cultural inspiration that high-performance computing can provide to those striving for enlightenment and understanding. That's a pretty tall order of benefits, but I think it's a realistic one; and during the conference various presenters have discussed aspects of those benefits.
Skilled, Imaginative Users and a Broad Spectrum of Applications
It's estimated that the pool of users trained in high-performance computing has increased a hundredfold since our last meeting in 1983. That's a lot of progress. Also, the use of high-performance computing in government and industry has expanded into many new and important areas since 1983. We were reminded that in 1983, the first high-performance computers for oil-reservoir modeling were just being introduced. We have identified a number of critical grand challenges whose solution will be enabled by near-future advances in high-performance computing.
We see that high-performance computing centers and the educational environment in which they exist are key to user education in computational science and engineering for industry. You'll notice I've used the word "industry" several times. Unfortunately, the educational pipeline for high-performance computing users is drying up, both through a lack of new entrants and through foreign-born researchers being drawn back to their home countries by the opportunities and attractions there.
One of the points mentioned frequently at this conference, and one point I will be emphasizing, is the importance of broadening the use of this technology into wider industrial applications. That may be one of the most critical challenges ahead of us. Today there are only pockets of high-performance computers in industry.
Finally, the current market for high-performance computing—that user and usage base—is small and increasingly fragmented because of the choices now being made available to potential and actual users of high-performance computing.
Workstations and Visualization Engines
The next element that was discussed was the emergence of low-cost, high-performance personal workstations and visualization engines. This has happened mainly since 1983. Remember that in 1983 most of us were using supercomputers through glass teletypes. There has been quite a change since then.
The rapid growth in microprocessor capability has been a key technology driver for this. Obviously, the rapid fall in microprocessor and memory costs has been a key factor in enabling people to buy these machines.
High-performance workstations allow cooperative computing with compute engines. As was pointed out, they let supercomputers be supercomputers by off-loading smaller jobs. The large and increasing installed base of these workstations, plus the strong productive competition in the industry, is driving hardware and software standards and improvements.
Next, many of the workstations appearing now include several processors, allowing a low-end entry into parallel processing. It was pointed out that there may be a million potential users of these machines, as compared with perhaps 10,000 to 100,000 users of the very high-end parallel machines. So this is clearly the broader base and therefore the more likely entry point.
Unfortunately, in my opinion, this very attractive, seductive, standalone environment may deflect users away from high-end machines, and it's possible that we will see a repetition on a higher plane of the VAX syndrome of the 1970s. That syndrome caused users to limit their problems to those that could be run on a Digital Equipment Corporation VAX machine; a similar phenomenon could stunt the growth of high-performance computing in the future.
Mass Storage and Accessible Knowledge Bases
Mass storage and accessible, large knowledge bases were largely ignored in this meeting—and I think regrettably so—though they are very important. There was some discussion of this, but not at the terabyte end.
What was pointed out is that mass-storage technology is advancing slowly compared with our data-accumulation and data-processing capabilities. There's an absence of standards for databases such that interoperability and human interfaces to access databases are hit-and-miss things.
Finally, because of these varied databases, we need but do not have expert systems and other tools to provide interfaces for us. So this area largely remains a part of the vision and not of the accomplishment.
Heterogeneous High-Performance Computer Engines
What I find amazing, personally, is that it appears that performance on the order of 10^12 floating-point operations per second—a teraflops or a teraops, depending on your culture—is really achievable by 1995 with known technology extrapolation. I can remember when we were first putting together the High Performance Computing Initiative back in the mid-1980s and asking ourselves what a good goal would be. We said that we would paste a really tough one up on the wall and go for a teraflops. Maybe we should have gone for a petaflops. The only way to achieve that goal is by parallel processing. Even today, at the high end, parallel processing is ubiquitous.
There isn't an American-made high-end machine that is not parallel. The emergence of commercially available massively parallel systems based on commodity parts is a key factor in the compute-engine market—another change since 1983. Notice that it is the same commodity parts, roughly, that are driving the workstation evolution as are driving these massively parallel systems.
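The arithmetic behind the teraflops goal makes the case for parallelism concrete: no single processor of the period comes anywhere near 10^12 operations per second, so thousands of processors are unavoidable. A back-of-the-envelope sketch (the per-processor peak rates below are illustrative assumptions, not figures from the talk):

```python
# Back-of-the-envelope: how many processors does a teraflops machine need?
# The per-processor peak rates are illustrative assumptions only.

TERAFLOPS = 1e12  # 10**12 floating-point operations per second

assumed_peak_per_processor = {
    "high-end vector processor (~1 Gflops peak)": 1e9,
    "fast commodity microprocessor (~100 Mflops peak)": 1e8,
}

for label, rate in assumed_peak_per_processor.items():
    n = TERAFLOPS / rate
    print(f"{label}: ~{n:,.0f} processors for 1 teraflops peak")
```

Even under the generous assumption of perfect efficiency, the commodity-parts route implies on the order of ten thousand processors, which is exactly the massively parallel direction described here.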
We are still unsure of the tradeoffs—and there was a lively debate about this at this meeting—between fewer and faster processors versus more and slower processors. Clearly the faster processors are more effective on a per-processor basis. On a system basis, the incentive is less clear. The payoff function is probably application dependent, and we are still searching for it. Fortunately, we have enough commercially available architectures to try out so that this issue is out of the realm of academic discussion and into the realm of practical experiment.
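The fewer-faster versus more-slower tradeoff can be made concrete with Amdahl's law, a standard model not presented at the meeting: if a fraction s of an application is inherently serial, the speedup on p processors is 1/(s + (1 - s)/p). The machine configurations below are hypothetical:

```python
def amdahl_speedup(serial_fraction, processors):
    """Speedup predicted by Amdahl's law for a given serial fraction."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)

# Two hypothetical machines with equal aggregate peak:
# 16 fast processors vs. 1024 slow ones (each 1/64 the speed).
for s in (0.01, 0.05):
    fast = 64 * amdahl_speedup(s, 16)   # per-processor speed 64x a slow one
    slow = amdahl_speedup(s, 1024)      # per-processor speed 1x
    print(f"serial fraction {s:.0%}: 16 fast -> {fast:.0f}x, "
          f"1024 slow -> {slow:.0f}x (relative to one slow processor)")
```

The model shows why the serial fraction, which varies by application, dominates the payoff: at s = 0 the two machines tie, while any serial residue penalizes the many-slow-processor design far more heavily.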
Related to that is an uncertain mapping of the various architectures available to us into the applications domain. A part of this meeting was the discussion of those application domains and what the suitable architectures for them might be. Over the next few years I'm sure we'll get a lot more data on this subject.
It was also brought out that it's important that one develops balanced systems. You have to have appropriate balancing of processor power, memory size, bandwidth, and I/O rates to have a workable system. By and large, it appears that there was consensus in this conference on what that balance should be. So at least we have some fairly good guidelines.
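One common engineering rule of thumb for such balance, often attributed to Gene Amdahl, calls for roughly one byte of memory and one bit per second of I/O for each operation per second of processor speed. Applying it to a hypothetical teraflops system (the rule is a folklore guideline, not a figure from the conference):

```python
# Apply the rough "1 byte of memory and 1 bit/s of I/O per op/s" balance
# rule (a folklore guideline) to a hypothetical teraflops system.

ops_per_second = 1e12  # 1 teraflops target

memory_bytes = ops_per_second * 1.0        # ~1 byte per op/s
io_bits_per_second = ops_per_second * 1.0  # ~1 bit/s per op/s

print(f"memory: ~{memory_bytes / 1e12:.0f} TB")
print(f"I/O:    ~{io_bits_per_second / 1e12:.0f} Tbit/s")
```

Whatever the exact ratios, the point stands: a teraflops of arithmetic with megabytes of memory or megabit I/O would be an unusable, unbalanced system.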
There was some discussion at this conference of new or emerging technologies—gallium arsenide, Josephson junction, and optical—which may allow further speedups. Unfortunately, as was pointed out, gallium arsenide is struggling, Josephson junction is Japanese, and optical is too new to call.
Fast, Local, Wide-Area, and National Networks
Next, let's turn to networking, which is as important as any other element and ties the other elements together.
Some of the good news is that we are obtaining more standards for I/O channels and networks. We have the ability to build on top of these standards to create things rather quickly. As an example of that, I mention the emergence of the Internet and the future National Research and Education Network (NREN) environment, which is based on standards, notably the Transmission Control Protocol/Internet Protocol (TCP/IP), and on open systems and has already proved its worth.
Unfortunately, as we move up to gigabit speeds, which we know we will require for a balanced overall system, we're going to need new hardware and new protocols. Some of the things that we can do today simply break down logically or electrically when we get up to these speeds. Still, there's going to be a heavy push to achieve gigabit speeds.
Another piece of bad news is that today the network services are almost nonexistent. Such simple things as yellow pages and white pages and so on for national networks just don't exist. This is like the Russian telephone system: if you want to call somebody, you call them to find out what their telephone number is because there are no phone books.
Another issue that was raised is how we can extend the national network into a broader user community. How can we move NREN out quickly from the research and education community into broader industrial usage? Making this transition will require dealing with sticky issues such as intellectual property rights, tariffed services, and interfaces.
An Extensive, Friendly, Productive, Interoperable Software Environment
An extensive, friendly, productive, interoperable software environment was acknowledged to be the most difficult element to achieve. We do have emerging standards such as UNIX, X Windows, and so on that help us to tie together software products as we develop them.
The large number of workstations, as has previously been the case with personal computers, has been the motivating factor for developing quality, user-friendly interfaces. These are where the new things get tried out. That's partly based on the large, installed base and, therefore, the opportunities for profitable experimentation.
Now, these friendly interfaces can be used and to some extent are used as access to supercomputers, but discussion during this conference showed that we have a way to go on that score. Unfortunately—and this was a main topic during the meeting—we do not have good standards for software portability and interfaces in a heterogeneous computing environment. We're still working with bits and pieces. It was acknowledged here that we have to tie the computing environment together through software if we are to have a productive environment. Finally—this was something that was mentioned over and over again—significant differences exist in architectures, which impede software portability.
Conclusions
First, if we look back to the meeting in 1983, we see that the high-performance computing environment today is much more complex. Before, we could look at individual boxes. Largely for the reasons that I mentioned, it's now a distributed system. To have an effective environment today, the whole system has to work: all the elements have to be at roughly the same level of performance if there is to be a balanced system. Therefore, the problem has become much more complex, and the solution more demanding.
To continue high-performance computing advances, it seems clear from the meeting that we need to establish effective mechanisms to coordinate our activities. Any one individual, organization, or company can only work on a piece of the whole environment. To have those pieces come together so that you don't have square plugs for round holes, some coordination is required. How to do that is a sociological, political, and cultural problem. It is at least as difficult, and probably rather more so, than the technical problems.
Next, as I alluded to before, high-performance computing as a business will live or die—and this is a quote from one of the speakers—according to its acceptance by private industry. The market is currently too small, too fragmented, and not growing at a rapid enough rate to remain viable. Increasing the user base is imperative. Each individual and organization should take as a challenge how we can do that. To some extent, we're the apostles: we believe, and we know what can be done, but most of the world does not.
Finally, let me look ahead. In my opinion, there's a clear federal role in high-performance computing. This role includes, but is not limited to, (1) education and training, (2) usage of high-performance computing for agency needs, (3) support for R&D, and (4) technology cooperation with industry. This is not one-way transfer; it goes both ways. Let's not be imperialistic.
The federal role in these and other areas will be a strong motivator and enabler to allow us to achieve the vision discussed during this meeting. It was made clear over and over again that the federal presence has been a leader, driver, catalyst, and strong influence on the development of the high-performance computing environment to date. And if this environment is to succeed, the federal presence has to be there in the future.
We cannot forget the challenge from our competitors and the fact that if we do not take on this challenge and succeed with it, there are others who will. The race is won by the fleet, and we need to be fleet.