Preferred Citation: Ames, Karyn R., and Alan Brenner, editors. Frontiers of Supercomputing II: A National Reassessment. Berkeley: University of California Press, 1994. http://ark.cdlib.org/ark:/13030/ft0f59n73z/


 
NSF Supercomputing Program

Larry Smarr

Larry Smarr is currently a professor of physics and astronomy at the University of Illinois at Urbana-Champaign and since 1985 has also been the Director of the National Center for Supercomputing Applications.

He received his Ph.D. in physics from the University of Texas at Austin. After a postdoctoral appointment at Princeton University, Dr. Smarr was a Junior Fellow in the Harvard University Society of Fellows. His research has resulted in the publication of over 50 scientific papers.

Dr. Smarr was the 1990 recipient of the Franklin Institute's Delmer S. Fahrney Medal for Leadership in Science or Technology.

I attended the 1983 Frontiers of Supercomputing conference at Los Alamos National Laboratory, where the subject of university researchers regaining access to supercomputers—after a 15-year hiatus—was first broached. There was a lot of skepticism as to whether such access would be useful to the nation. That attitude was quite understandable at the time. The university community is the new kid on the block, as far as the participants at this conference are concerned, and we were not substantially represented at the meeting in 1983.

Today, the attitude is quite different. Part of my presentation will be devoted to what has changed since 1983.

As you know, in 1985–86 the National Science Foundation (NSF) set up five supercomputing centers, one of which has since closed (see Al Brenner's paper, Session 12). The four remaining centers are funded through 1995. Three of the four supercomputer center directors are attending this conference: apart from myself, representing the National Center for Supercomputing Applications (NCSA) at the University of Illinois, there are Sid Karin from San Diego and Michael Levine from Pittsburgh, as well as the entire NSF hierarchy—Rich Hirsh, Mel Ciment, Tom Weber, and Chuck Brownstein, right up to the NSF Director, Erich Bloch (a Session 1 presenter).

During the period 1983–86, we started with no network. The need for access to the supercomputer centers was the major force that drove the establishment of the NSF network. Usage of that network is currently increasing at 25 per cent per month, compounded. So it's a tremendous thing.
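As a back-of-the-envelope check (assuming the 25 per cent monthly rate holds steadily over a full year), compounding gives

\[
(1 + 0.25)^{12} \approx 14.6 ,
\]

that is, traffic roughly doubles every three months and grows about fifteenfold in a single year.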

There were three universities that had supercomputers when the program started; there are now well over 20. So the capacity in universities has expanded by orders of magnitude during this brief period. During those five years alone, we've been able to provide supercomputer access to some 11,000 academic users working on almost 5000 different projects, out of which some 4000 scientific papers have come. We've trained an enormous number of people, organized scientific symposia, and sponsored visiting scientists on a scale unimagined in 1983.

What I think is probably, in the end, most important for the country is that affiliates—universities, industries, and vendors—have grown up around these centers. In fact, there are some 46 industrial partners of the sort discussed extensively in Session 9, that is, consumers of computers and communications services. Every major computer/communications-services vendor is also a working partner with the centers and is therefore getting feedback about what we will need in the future.

If I had to choose one aspect, one formerly pervasive attitude, that has changed, it's the politics of inclusion. Until the NSF centers were set up, I would say most supercomputer centers operated by exclusion, that is, inside laboratories that were fairly well closed. There was no access to them, except for, say, the Department of Energy's Magnetic Fusion Energy facility and the NSF National Center for Atmospheric Research. In contrast, the NSF centers' goal is to provide access to anyone in the country who has a good idea and the capability to try it out.

Also, unlike almost all the other science entities in the country, instead of being focused on a particular science and engineering mission, we are open to all aspects of human knowledge. That's not just the natural sciences. As you know, many exciting breakthroughs in computer art and music and in the social sciences have emerged from the NSF centers.

If you imagine supercomputer-center capacity represented by a pie chart (Figure 1), the NSF directorate serves up the biggest portion to the physical sciences. Perhaps three-quarters of our cycles are going to quantum science. I find it very interesting to recall that when I was at Livermore in the 1970s, it was all continuum field theory, fluid dynamics, and the like. So the whole notion of which kind of basic science these machines should be working on has flip-flopped in a decade, and that's a very profound change.

The centers distribute far more than cycles. They're becoming major research centers in computational science and engineering. We have our own internal researchers—world-class people in many specialties—who work with the scientific community nationwide; some of the most important workshops in the field are being sponsored by the centers. You're also seeing us develop software tools that are specific to particular disciplines: chemistry, genome sequencing, and so forth. That will be a significant area of growth in the future.

Figure 1.
Usage, by discipline, at NSF supercomputing centers.

There's no preexisting organizational structure in our way of doing science because the number of individuals who do computing in any field of science is still tiny. Their computational comrades are from biology, chemistry, engineering—you name it—and there are no national meetings and no common structure that holds them together culturally. So the centers are becoming a major socializing force in this country.

What we are seeing, as the centers emerge from their first five-year period of existence and enter the next five-year period, is a move from more or less off-the-shelf, commercially available supercomputers to a very wide diversity of architectures. Gary Montry, in his paper (see Session 6), represents the divisions of parallel architecture as a branching tree. My guess is that you, the user, will have access to virtually every one of those little branches in one of the four centers during the next few years.

Now, with respect to the killer-micro issue (also discussed by Ben Barker in Session 12), in the four extant centers we have about 1000 workstations and personal computers, and at each center we have two or three supercomputers. Just like all of the other centers represented here, we at NCSA have focused heavily on the liberating and enabling aspect of the desktop. In fact, I would say that at the NSF centers from the beginning, the focus has been on getting the best desktop machine in the hands of the user and getting the best network in place—which in turn drives more and more use of supercomputers. If you don't have a good desktop machine, you can't expect to do supercomputing in this day and age. So workstations and supercomputers form much more of a symbiosis than a conflict. Furthermore, virtually every major workstation manufacturer has a close relationship with one or more of the centers.

The software tools that are developed at our centers in collaboration with scientists and then released into the public domain are now being used by over 100,000 researchers in this country, on their desktops. Of those, maybe 4000 use the supercomputer centers. So we have at least a 25-to-one ratio of people that we've served on the desktop, compared with the ones that we've served on the supercomputers, and I think that's very important. The Defense Advanced Research Projects Agency has, as you may know, entered into a partnership with NSF to help get some of these alternate architectures into the centers. In the future, you're going to see a lot of growth as a result of this partnership.

The total number of CRAY X-MP-equivalent processor hours that people used at all five centers (Figure 2) has steadily increased, and there is no sign of that trend tapering off. What I think is more interesting is the number of users who actually sign on in a given month and do something on the machines (Figure 3). There is sustained growth, apart from a period in late 1988, when the capacity didn't grow very fast and the machines became saturated, discouraging some of the users. That was a very clear warning to us: once you tell the scientific community that you're going to provide a new and essential tool, you've made a pact. You have to continue upgrading on a regular and rapid basis, or else the users will become disenchanted and do some other sort of science that doesn't require supercomputers. We think that this growth will extend well into the future.

Figure 2.
Total CRAY X-MP-equivalent processor hours used in five NSF supercomputing centers.

Figure 3.
Number of active users at five NSF supercomputer centers.

I am especially excited that many users are, for the first time, getting access to advanced computers, and that the number of first-time users grew during the very period when desktops were becoming populated with ever-more-powerful computers. Instead of seeing the demand curve dip, you're seeing it rise even more sharply. Increasingly, the postprocessing, the code development, and so on will take place at the workstation, with clients then throwing their codes across the network for the large runs when needed.

Who, in fact, uses these centers? A few of our accounts consume upwards of 5000 CPU hours per year, but 95 per cent of our clients consume less than 100 hours per year (Figure 4). The implication is that much of the work being done at the centers could be done on desktop machines. Yet these small users go to the trouble of writing a proposal, go through peer review, experience weeks to months of uncertainty as to whether and when they'll actually get on the supercomputer, and then, in many cases, have to work over what is only a 9600-baud connection by the time we get down to the end of the regional net.

Figure 4.
Percentage of total users versus annual CPU-hour consumption, January FY 1988 through April FY 1990: 95 per cent of all users consume less than 100 CPU hours per year.


It's like salmon swimming upstream: you can't hold them back. We turn down probably 50 per cent of the people who want to get on the machine, for lack of capacity.

What has happened here is that the national centers perform two very different functions. First, a great many of the users, 95 per cent of them, are being educated in computational science and engineering, and they are using their workstations simultaneously with the supercomputers. In fact, day to day, they're probably spending 90 per cent of their working hours on their desktop machines. Second, because of the software our centers have developed, the Crays, the Connection Machines, the Intel Hypercubes are just windows on their workstations. That's where they are, that's how they feel.

You live on your workstation. The most important computer to you is the one at your fingertips. And the point is, with the network, and with modern windowing software, everything else in the country is on your desktop. It cuts and pastes right into the other windows, into a word processor for your electronic notebook.

For instance, at our center, what's really amazing to me is that roughly 20 per cent of our users on a monthly basis are enrolled in courses offered by dozens of universities—courses requiring the student to have access to a supercomputer through a desktop Mac. That percentage has gone up from zero in the last few years.

These and the other small users, representing 95 per cent of our clients, consume only 30 per cent of the cycles. So 70 per cent of the cycles, the vast majority of the cycles, are left for a very few clients who are attacking the grand-challenge problems.

I think this pattern will persist for a long time, except that the middle will drop out. Those users who figure it out, who know what software they want to run, will simply work on their RISC workstations. That will constitute a very big area of growth. And that's wonderful. We've done our job. We got them started.

You can't get an NSF grant for a $50,000 workstation unless you've got a reputation. You can't get a reputation unless you can get started. What the country lacked before and what it has now is a leveraging tool. Increasing the human-resource pool in our universities by two orders of magnitude is what the NSF centers have accomplished.

But let me return to what I think is the central issue. We've heard great success stories about advances in supercomputing from every agency, laboratory, and industry—but they're islands. There is no United States of Computational Science and Engineering. There are still umpteen colonies or city-states. The network gives us the physical wherewithal to change that. However, things won't change by themselves. Change requires political will and social organization. The NSF centers are becoming a credible model for the kind of integration that's needed because, just in terms of the dollars alone (not the equipment-in-kind and everything else—just real, fundable dollars), this is the way the pie looks in, say, fiscal year 1989 (FY 1989) (Figure 5).

There is a great deal of cost sharing among all of the key sectors—the state, the regional areas, the consumers of the computers, and the producers. NSF is becoming the catalyst pulling these components together. Next, we need to do something similar, agency to agency. The High Performance Computing Initiative (HPCI) is radical because it is a prearranged, multiagency, cooperative approach. The country has never seen that happen before. Global Change is the only program that comes close, but it hasn't had the benefit of a decade of detailed planning the way HPCI has. This is probably the first time our country has tried anything like it.

I would like to hear suggestions on how we might mobilize the people attending this conference—the leaders in all these islands—and, using the political and financial framework afforded by HPCI over the rest of this decade, change our way of doing business. That's our challenge. If we can meet that challenge, we won't need to worry about competitiveness in any form. Americans prevail when they work together. What we are not good at is making that happen spontaneously.

Figure 5.
Supercomputer-center cost sharing, FY 1989.

