End-to-End, QoS-Driven Resource Management for the Next-Generation Internet
---------------------------------------------------------------------------

Authors:

Jaroslaw Sydir, Research Engineer
EJ314
SRI International
333 Ravenswood Ave.
Menlo Park, CA 94025
Phone (415) 859-4552
Fax (415) 859-4812
Email: sydir@erg.sri.com

Bikash Sabata, Computer Scientist
EK386
SRI International
333 Ravenswood Ave.
Menlo Park, CA 94025
Phone (415) 859-2281
Fax (415) 859-4812
Email: sabata@erg.sri.com

Saurav Chatterjee, Research Engineer
EK360
SRI International
333 Ravenswood Ave.
Menlo Park, CA 94025
Phone (415) 859-3594
Fax (415) 859-4812
Email: saurav@erg.sri.com

In recent years there has been a growing need to run applications that require real-time and other quality-of-service (QoS) guarantees on the Internet. The advent of high-speed networks and processors has made possible a variety of widely distributed applications. However, the lack of real-time services and guarantees in the Internet has limited the large-scale deployment of such applications. These applications have end-to-end QoS requirements that can be expressed in terms of their timing and of the precision and accuracy of the results that they produce. For example, the user of a videoconference application is interested in the quality of the picture (the end result), which is an aggregation of the QoS of capturing the video, compressing it, transmitting it over the network, decompressing it, and displaying it.

Distributed applications use computing, communication, and storage resources. Furthermore, a move towards network computers, appliance computers, and mobile computers is shifting the computing and storage requirements of applications away from end-user machines and towards larger computing and storage servers. In this paradigm, not only the communication resources but also the computing and storage servers are shared among many users.
To utilize these shared resources efficiently and to provide end-to-end QoS to applications, the resources must be managed in a coordinated manner. We therefore propose that the Next-Generation Internet (NGI) should be viewed not as a communication infrastructure comprising a network of networks, but rather as a system of distributed systems. Providers of the NGI should offer not only communication services, but also computing and storage services. Whereas the current Internet seamlessly connects individual networks into a network of networks, the NGI should seamlessly connect individual distributed systems into a system of distributed systems. To achieve this objective, the NGI requires a new architecture that comprises a set of protocols and algorithms for the coordinated management of all system resources, so as to provide the required end-to-end QoS support. The Telecommunications and Distributed Processing Group at SRI International is developing an architecture based on this vision [1].

A comprehensive resource management strategy requires that the new architecture take into account the perspectives of

(1) The applications, which understand only their resource requirements as a function of the QoS requirements
(2) The individual resources, which may have individual scheduling policies
(3) The system, which has policies for resolving the potentially conflicting requirements of various applications competing for system resources.

Our architecture consists of

(1) A taxonomy for defining QoS specifications
(2) Models for applications, resources, and the system
(3) A suite of resource management algorithms that manage the system of distributed systems.

The algorithms must simultaneously consider the objectives and constraints of applications, resources, and the system in making resource management decisions.
We believe that an architecture of this type should be adopted by the NGI, to support the increasing number of real-time applications that will execute on the NGI. In the remainder of this paper we discuss how the basic architecture of today's Internet must change to meet this goal of end-to-end QoS support.

Modeling Applications, Resources, and the System
------------------------------------------------

When making decisions, the resource management system, which is the core of our architecture, must consider the (frequently conflicting) objectives of applications, resources, and the system. Unlike current network traffic models, our application model [2] captures the end-to-end structure of an application, including its demands on computing, storage, and communication resources. The model also captures the end-to-end QoS requirements and preferences of the application. The resource model represents the computing, storage, and communication resources and their scheduling attributes in a uniform way. The system model [3] represents the system as a collection of distributed systems and their attributes. As in the current Internet, this information is organized hierarchically, which enables each subsystem to maintain administrative control over its resources and to use its own resource management scheme. We have also defined a QoS interface that enables different subsystems to communicate with each other.

QoS Translation, Scheduling, Allocation, and Routing
----------------------------------------------------

To provide end-to-end QoS support, resource management must be pervasive within each system layer of the NGI. Because the NGI will be a collection of individual distributed systems with individual administrative controls and resource management schemes, a QoS translation algorithm is required that will recursively translate top-level application QoS requirements into QoS requirements for lower-level subsystems.
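
To make the idea of recursive translation concrete, the following is a minimal illustrative sketch and not the translation algorithm of our architecture. It assumes that the only QoS parameter is latency, that latency is additive along the application path, and that each subsystem receives a share of its parent's budget in proportion to a fixed weight. The names Subsystem and translate_latency, the weights, and the 200 ms budget are all hypothetical and chosen only for illustration; the component names follow the videoconference example used earlier.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Subsystem:
    """Hypothetical node in a hierarchy of distributed systems."""
    name: str
    weight: float = 1.0  # assumed relative share of the parent's latency budget
    children: List["Subsystem"] = field(default_factory=list)


def translate_latency(node: Subsystem, budget_ms: float) -> Dict[str, float]:
    """Recursively split a latency budget among lower-level subsystems.

    Assumes additive latency and weight-proportional shares; both are
    simplifications made for this sketch.
    """
    budgets = {node.name: budget_ms}
    total = sum(child.weight for child in node.children)
    for child in node.children:
        share = budget_ms * child.weight / total
        budgets.update(translate_latency(child, share))
    return budgets


# Hypothetical videoconference application with an assumed 200 ms
# end-to-end latency requirement.
app = Subsystem("videoconference", children=[
    Subsystem("sender", 1.0, [Subsystem("capture", 1.0),
                              Subsystem("compress", 4.0)]),
    Subsystem("network", 2.0),
    Subsystem("receiver", 1.0, [Subsystem("decompress", 4.0),
                                Subsystem("display", 1.0)]),
])

budgets = translate_latency(app, 200.0)
for name, ms in budgets.items():
    print(f"{name}: {ms:.1f} ms")
```

In this sketch each subsystem sees only the budget handed down by its parent, which mirrors the requirement that subsystems keep their own administrative control. QoS parameters that do not combine additively, such as precision or accuracy, would need a different splitting rule.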
QoS translation will enable the NGI to coordinate the control of disparate distributed systems when providing end-to-end QoS to applications. QoS translation is based on a QoS taxonomy that we have developed [4] to describe the needs of the applications to the system. End-to-end QoS requirements of the application are decomposed into the QoS requirements of the individual application components. Research is needed to understand how the QoS achieved by the individual application components aggregates into end-to-end application QoS.

Each resource management scheme executes its own allocation, scheduling, and adaptivity algorithms according to the QoS requirements provided by QoS translation. Unlike the current Internet, the NGI will need an integrated allocation and routing algorithm because it must allocate shared computing, storage, and communication resources to each application [5]. Similarly, network-level flow control will not suffice; instead, the NGI will require end-to-end flow control over all resources [6]. Additionally, if the system ever needs to degrade application QoS in response to resource failures or security attacks, it must do so gracefully and in an end-to-end, system-wide manner.

Conclusion
----------

The design of the NGI gives us a unique opportunity to reevaluate the design of the Internet and to propose fundamental changes in the Internet paradigm. As noted above, we believe that the NGI should be viewed as a collection of distributed computing systems instead of as a network of networks. The potential of the NGI will be fully realized only by the development of an architecture that supports QoS guarantees for applications by coordinating the management of the NGI's communication, computing, and storage resources.
Although the near-term focus of the designers of the NGI may be on the development of a high-speed communication backbone network, we believe that an architecture of the type that we propose must be adopted from the beginning, to facilitate a graceful evolution of the NGI as it reaches closer and closer to the end users.

References
----------

[1] Sydir, J., S. Chatterjee, B. Sabata, and T. Lawrence. 1997 (in preparation). "An Adaptive QoS-driven Resource Management Architecture for Distributed Systems," draft technical report, SRI International, Menlo Park, California.

[2] Chatterjee, S., J. Sydir, B. Sabata, M. Davis, and T. Lawrence. 1997 (in preparation). "Modeling Distributed Applications with QoS Requirements," draft technical report, SRI International, Menlo Park, California.

[3] Sabata, B., S. Chatterjee, J. Sydir, and T. Lawrence. 1997 (in preparation). "Hierarchical Modeling of Systems for QoS-Based Distributed Resource Management," draft technical report, SRI International, Menlo Park, California.

[4] Sabata, B., S. Chatterjee, M. Davis, J. Sydir, and T. Lawrence. 1997. "Taxonomy for QoS Specifications," Proc. IEEE WORDS '97, (February).

[5] Chatterjee, S. 1997. "A Quality of Service Based Allocation and Routing Algorithm for Distributed, Heterogeneous Real Time Systems," Proc. 17th IEEE International Conference on Distributed Computing Systems, Baltimore, Maryland, (May).

[6] Chatterjee, S. and J. Strosnider. 1995. "Distributed Pipeline Scheduling: A Framework for Distributed, Heterogeneous Real-Time System Design," Computer Journal (British Computer Society), Vol. 38, No. 4.