Towards a Computation-Rich NGI

Ian Foster, Rick Stevens
Mathematics and Computer Science Division
Argonne National Laboratory
Argonne, IL 60439
{foster,stevens}@mcs.anl.gov
Tel: 630 252 4619
Fax: 630 252 5986

The development and deployment of a Next Generation Internet (NGI) will enable not only more widespread use of today's applications, such as videoconferencing, but also new classes of applications. We argue that many of those new applications will be concerned with computation as much as communication, as supercomputers, scientific instruments, databases, and similar resources become accessible outside the supercomputer centers and research laboratories to which they were previously confined. For the first time, it will become possible to construct networked virtual supercomputers that integrate supercomputers, large databases, archival storage devices, advanced visualization devices, and scientific instruments. New classes of desktop tools (tomorrow's spreadsheet?) will draw on remote computational resources to enhance dramatically the computational capabilities accessible to the ordinary user.

The term "computational grid" is suggestive of what these new capabilities may mean. The electric power grid dramatically changed almost every aspect of our lives by providing universal access to electric power; a computation-rich NGI may well enable similar benefits by providing ubiquitous access to computing.

However, multiple hard research problems must be addressed before a computational grid can be used in a routine fashion. Furthermore, these problems are in many cases distinct from those associated with communication-oriented applications such as videoconferencing. We argue that a research and development agenda for the NGI must address these problems if the NGI is to have the flexibility required for computation-rich applications.

NEED FOR APPLICATION DRIVERS

We require application drivers to help define the new technologies needed for a computation-rich NGI.
Defining and developing these application drivers is a hard problem. Many future applications are likely not yet imagined, and current network computing efforts are typically crippled by the limited capabilities of today's Internet. The 1995 I-WAY effort is representative of what we believe is a good approach to resolving these difficulties: focus efforts on high-end applications (because these are more likely to typify tomorrow's requirements) and create environments that enthuse, and hence involve, the best people.

The I-WAY connected supercomputers and other resources at 17 different sites across North America, and 60 groups developed applications in many different areas, with the integration of computation and networking serving as a unifying theme. The following are four illustrative examples:

* Collaborative engineering: Lori Freitag and her colleagues at ANL and Nalco Fueltech demonstrated a collaborative engineering system that uses virtual reality techniques to allow engineers at multiple sites to interact with each other and with a simulation of an industrial boiler.

* Smart instruments: Craig Lee and his colleagues at the Aerospace Corporation and Caltech demonstrated a system that uses compute resources acquired dynamically on the network to perform real-time cloud detection and visualization of data obtained from a meteorological satellite.

* Large-scale computation: Jarek Nieplocha and his colleagues at PNNL demonstrated a distributed quantum chemistry modeling system that exploited the computational power of three large IBM SP multicomputers to perform ab initio calculations for large organic molecules.

* Remote visualization: Bill Hibbard and his colleagues at Wisconsin and ANL demonstrated a geophysical modeling and visualization application in which a supercomputer with a large internal memory and attached tertiary storage is used as a data server for remote visualization clients.

Experience with these realistic high-end applications has yielded valuable insights into the technologies required for future high-performance networked computing systems.

NEED FOR TECHNOLOGY DEVELOPMENT

Today's networked applications are clumsy to develop, configure, and run: the user must operate as a hunter-gatherer, painfully searching the net for the required compute and data resources. This approach is impractical for all but the most dedicated user. The applications revolution will happen only when users can acquire and use distributed resources much as they do other goods today: via middlemen, agents, or other entities that allow resources to be located, acquired, combined, and used without concern for their location or other specifics.

The following are just a few of the new technologies required if this revolution is to occur:

* Resource location and scheduling mechanisms that can match user requirements with resources, and coschedule networks, computers, and other resources.

* Network-aware programming tools that can translate application-level requirements for resources and quality of service into lower-level network parameters, and vice versa.

* New algorithms and application structures that can operate efficiently in heterogeneous and dynamic network environments.

* Uniform authentication, authorization, and licensing mechanisms that integrate computation as well as network and data resources.

* Testbeds in which application, middleware, and network issues can be investigated in an integrated fashion.

Together, these and other software and hardware infrastructure components combine to create an environment in which we can explore and demonstrate, on a continuous basis, the new classes of computation-rich applications that the NGI will enable. Over time, hardware testbeds and experimental software can transition to the production environment (as was the case with the ARPAnet), achieving technology transfer to the larger community.