Government Bodies As Investors
Barry Boehm
Barry W. Boehm is currently serving as Director of the Defense Advanced Research Projects Agency's (DARPA's) Software and Intelligent Systems Technology Office, the U.S. Government's largest software research organization. He was previously Director of DARPA's Information Science and Technology Office, which also included DARPA's research programs in high-performance computing and communications.
I'd like to begin by thanking the participants at this conference for reorienting my thinking about high-performance computing (HPC). Two years ago, to the extent that I thought of the HPC community at all, I tended to think of it as sort of an interesting, exotic, lost tribe of nanosecond worshippers.
Today, thanks to a number of you, I really do feel that it is one of the most critical technology areas that we have for national defense and economic security. Particularly, I'd like to thank Steve Squires (Session 4 Chair) and a lot of the program managers at the Defense Advanced Research Projects Agency (DARPA); Dave Nelson and the people in the Federal Coordinating Committee on Science, Engineering, and Technology (FCCSET); Gene Wong for being the right person at the right time to move the High Performance Computing Initiative along; and finally all of you who have contributed to the success of this meeting. You've really given me a lot better perspective on what the community is like and what its needs and concerns are, and you've been a very stimulating group of people to interact with.
In relation to Gene's list, set forth in the foregoing presentation, I am going to talk about government bodies as investors, give DARPA as an example, and then point to a particular HPC opportunity that I think we have as investors in this initiative in the area of software assets.
If you look at the government as an investor, it doesn't look that much different from Cray Research, Inc., or Thinking Machines Corporation or IBM or Boeing or the like. It has a limited supply of funds, it wants to get long-range benefits, and it tries to come up with a good investment strategy to do that.
Now, the benefits tend to be different. In DARPA's case, it's effective national defense capabilities; for a commercial company, it's total corporate value in the stock market, future profit flows, or the like.
The way we try to do this at DARPA is very interactive: we work a lot with Department of Defense (DoD) users and operators and with the aerospace industry, trying to figure out the most important things DoD is going to need in the future, playing those off against what technology is likely to supply, and evaluating the relative cost-benefit relationships. Out of all of that comes an R&D investment strategy. And I think this is the right way to look at the government as an investor.
The resulting DARPA investment strategy tends to include things like HPC capabilities, not buggy whips and vacuum tubes. But that does not mean we're doing this to create industrial policy. We're doing this to get the best defense capability for the country that we can.
The particular way we do this within DARPA is that we have a set of investment criteria that we've come up with and use for each new proposed program that comes along. The criteria are a little bit different if you're doing basic research than if you're doing technology applications, but these tend to be common to pretty much everything that we do.
First, there needs to be a significant long-range DoD benefit, generally involving a paradigm shift. There needs to be minimal DoD ownership costs. Particularly with the defense budget going down, it's important that these things not be a millstone around your neck. The incentive to create things that are commercializable, so that the support costs are amortized across a bigger base, is very important.
Zero DARPA ownership costs: we do best when we get in, hand something off to somebody else, and get on to the next opportunity that's there. That doesn't mean that there's no risk in the activity. Also, if Cray is already doing it well, if IBM is doing it, if the aerospace industry is doing it, then there's no reason for DARPA to start up something redundant.
A good many of DARPA's research and development criteria, such as good people, good new ideas, critical mass, and the like, are self-explanatory. And if you look at a lot of the things that DARPA has done in the past, like ARPANET, interactive graphics, and Berkeley UNIX, you see that the projects tend to fit these criteria reasonably well.
So let me talk about one particular investment opportunity that I think we all have, which came up often during this conference. This is the HPC software problem. I'm exaggerating a bit for effect, but we have on the order of 400 live HPC projects at any one time, and I would say there are at least 4000, maybe 8000 or 12,000, ad hoc debuggers that people build to get their work done. And then Project 4001 comes along and says, "How come there's no debugger? I'd better build one." I think we can do better than that. I think there's a tremendous amount of capability that we can accumulate, capitalize on, and invest in.
There are a lot of software libraries, both in terms of technology and experience. NASA has COSMIC, NSF has the supercomputing centers, DoD has the Army RAPID repository, DARPA is building a STARS software repository capability, and so on. There's networking technology, access-control technology, file and database technology, and the like, which could support aggregating these libraries.
The hardware vendors have user communities that can accumulate software assets. The third-party software vendor capabilities are really waiting for somebody to aggregate the market so that it looks big enough that they can enter.
Application communities build a lot of potentially reusable software. The research community builds a tremendous amount just in the process of creating both machines, like Intel's iWarp, and systems, like Nectar at Carnegie Mellon University; and, as Gregory McRae of Carnegie Mellon will attest, the applications that the research people build yield a lot of good software.
So what kind of a capability could we produce? Let's look at it from a user's standpoint.
The users ought to have at their workstations a capability to mouse and window their way around a national distributed set of assets and not have to worry about where the assets are located or where the menus are located. All of that should be transparent so that users can get access to things that the FCCSET HPC process invests in directly, get access to various user groups, and get access to software vendor libraries.
There are certain things that they, and nobody else, can get access to. If they're with the Boeing Company, they can get the Boeing airframe software, and if they're in some DoD group that's working low observables, they can get access to that. But not everybody can get access to that.
As you select one of the categories of software you're interested in, you tier down the menu, decide that you want, say, a debugging tool, and then go and look at what's available, what it runs on, what kind of capabilities it has, and so on.
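To make that concrete, here is a minimal sketch of the kind of catalog lookup that could sit behind such a menu. The record fields, the sample entries, and the group names are all illustrative assumptions, not the interface of any existing repository; the point is simply that the user asks for a category and a platform, and location and access control stay transparent.

```python
# Hypothetical sketch of a location-transparent software-asset catalog query.
from dataclasses import dataclass, field

@dataclass
class AssetRecord:
    name: str
    category: str              # e.g., "debugger", "mesh-generator"
    platforms: list            # machines the asset runs on
    capabilities: list         # brief capability descriptors
    repository: str            # which library actually holds it (invisible to the user)
    access_groups: set = field(default_factory=lambda: {"public"})

def find_assets(catalog, category, platform, user_groups):
    """Return assets of the requested category that run on the user's platform
    and that the user's group memberships permit them to see."""
    return [
        a for a in catalog
        if a.category == category
        and platform in a.platforms
        and a.access_groups & user_groups
    ]

# Example: a user tiers down to "debugger" for an iWarp-class machine.
catalog = [
    AssetRecord("par-dbg", "debugger", ["iWarp", "CM-2"],
                ["breakpoints", "message tracing"], "nsf-center-library"),
    AssetRecord("airframe-solver", "solver", ["Cray Y-MP"],
                ["structural analysis"], "boeing-internal",
                access_groups={"boeing"}),
]
print(find_assets(catalog, "debugger", "iWarp", {"public"}))
```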
Behind that display are a lot of nontrivial but, I believe, workable issues. Stimulating high-quality software asset creation is a nontrivial job, as anybody who has tried to do it knows. I've tried it, and it's a challenge.
An equally hard challenge is screening out low-quality assets, a sort of software-pollution-control problem. Another is intellectual property rights and licensing: how do people make money by putting their software on this network and letting people use it?
Yet another issue is warranties. What if the software crashes right in the middle of some life- or company- or national-critical activity?
Access-control policies pose additional challenges. Who is going to access the various valuable assets? What does it mean to be "a member of the DoD community," "an American company," or things like that? How do you devolve control to your graduate students or subcontractors?
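One hedged way to picture those access-control questions is the sketch below: named communities, per-asset policies, and explicit delegation from a principal to a student or subcontractor. Every name and rule here is an invented placeholder, not a statement of actual DoD or DARPA policy.

```python
# Illustrative access-control sketch: communities, asset policies, delegation.
communities = {
    "alice": {"dod", "us-company"},      # a DoD program engineer (hypothetical)
    "boeing-aero": {"us-company"},
}

delegations = {
    "grad-student-1": "alice",           # alice has devolved access to a student
}

asset_policy = {
    "low-observables-code": {"dod"},     # restricted to the DoD community
    "public-debugger": set(),            # empty set: open to everyone
}

def can_access(user, asset):
    """True if the user (or the principal who delegated to them) belongs to at
    least one community the asset's policy requires, or the asset is open."""
    principal = delegations.get(user, user)
    required = asset_policy[asset]
    return not required or bool(communities.get(principal, set()) & required)

print(can_access("alice", "low-observables-code"))          # True
print(can_access("grad-student-1", "low-observables-code")) # True, via delegation
print(can_access("boeing-aero", "low-observables-code"))    # False
```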
Distributed-asset management is, again, a nontrivial job. You can go down a list of additional factors. Dave Nelson mentioned such things as interface standards during his presentation in this session, so I won't cover that ground again except to reinforce his point that these are very important to good software reuse. But I think that all of these issues are workable and that the asset base benefits are really worth the effort. Right now one of the big entry barriers for people using HPC is an insufficient software asset base. If we lower the entry barriers, then we also get into a virtuous circle, rather than a vicious circle, in that we increase the supply of asset producers and pump up the system.
The real user's concern is reducing the calendar time to solution. Having the software asset base available will decrease the calendar time to solution, as well as increase application productivity, quality, and performance.
A downstream research challenge is the analog of spreadsheets and fourth-generation languages for high-performance applications. Such a system would let you say, "I want to solve this particular structural-dynamics problem," and it would go off and figure out what kind of mesh sizes you need, what kind of integration routine you should use, and so on. Then it would run your application and interactively present you with the results.
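As a toy illustration of that idea, here is a sketch in which the user states the problem declaratively and the system picks a mesh size and an integration routine. The selection rules and numbers are invented placeholders for the sake of the example, not real solver heuristics.

```python
# Toy sketch of a declarative "spreadsheet analog" for HPC applications.
def plan_run(problem_type, characteristic_length, highest_frequency_hz):
    """Pick a mesh size and an integration routine from a declarative
    description of a structural-dynamics problem (illustrative rules only)."""
    if problem_type != "structural-dynamics":
        raise ValueError("this sketch only handles structural dynamics")

    # Resolve the highest frequency of interest with a finer mesh (assumed rule).
    mesh_size = characteristic_length / max(10, highest_frequency_hz / 10)

    # Stiff high-frequency content suggests an implicit integrator (assumed rule).
    integrator = ("implicit-newmark" if highest_frequency_hz > 100
                  else "explicit-central-difference")

    return {"mesh_size": mesh_size, "integrator": integrator}

# "I want to solve this particular structural-dynamics problem":
print(plan_run("structural-dynamics", characteristic_length=2.0,
               highest_frequency_hz=250))
# -> {'mesh_size': 0.08, 'integrator': 'implicit-newmark'}
```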
We at DARPA have been interacting quite a bit with the various people in the Department of Energy, NASA, and NSF and are going to try to come up with a system like that as part of the High Performance Computing Initiative. I would be interested in hearing about the reactions of other investigators to such a research program.