NGI: Case for End-to-End Quality of Service

John Bruno, Eran Gabber, Banu Ozden and Abraham Silberschatz
Bell Laboratories
600 Mountain Ave.
Murray Hill, NJ 07974
{jbruno, eran, ozden, avi}@research.bell-labs.com

The Next Generation Internet (NGI) initiative will accelerate interest in applications such as video conferencing, digital libraries, telemedicine, distance learning, distributed workgroups, telecommuting, and entertainment. In addition to the anticipated advances in communication, corresponding advances in computing and storage technologies are also stimulating the invention of new real-time services and multimedia applications.

Our themes in this white paper are that (1) support for "End-to-End Quality of Service" guarantees is a critical ingredient in realizing the benefits of the accelerating technological advances, and (2) operating system support for Quality of Service (QoS) guarantees is an essential component in achieving end-to-end QoS.

In order to achieve end-to-end QoS, all resources required by applications, namely networks, servers, and end-users' machines, must support QoS. We believe that the bandwidth requirements of new multimedia applications will keep pace with the bandwidth increases in communication, computation and storage resources. As a result, even if resource bandwidths increase by several orders of magnitude, there will still be a need to schedule "scarce" resources and thus to support QoS.

We are beginning to see the emergence of QoS guarantees in data communication networks. ATM networks provide certain QoS guarantees to certain classes of traffic in an integrated services environment, and some manufacturers are beginning to provide limited QoS in IP routers through the identification of "flows" and output link scheduling techniques. Multicast communications (MBONE), RSVP, RTP, and other protocols are also driving the need for end-to-end QoS. Much work has been done in supporting QoS in servers and networks.
However, these QoS guarantees are of little use if they cannot be extended to the endpoint applications via operating system support for the corresponding QoS parameters. Furthermore, QoS-enabled operating systems are applicable not only to end-users' machines, but also to service providers' computing resources and to intermediate nodes (i.e., network components such as routers and firewalls). Much less work has been done on QoS support in operating systems than in networking.

The desire to support multiple real-time applications on a single platform requires that the operating system be able to divide the system resources among applications in a manner that achieves the desired levels of predictable performance. Current multiprogrammed operating systems do not provide QoS guarantees, since the performance of a single application is, in part, determined by the overall load on system resources. As a result, many users prefer stand-alone systems with limited dependency on shared resources, achieving some semblance of QoS by indirectly controlling the system workload.

The emergence of support for QoS in networks is based, in part, on research regarding the scheduling of "connections" (a connection is a series of related packets, sometimes called a flow) on a communication link [dks:sigcomm89,z:sigcomm90,g:infocom94,z:proc95,ggv:osdi96]. The QoS parameters of interest for link scheduling include fairness, throughput, delay and delay jitter, where fairness and throughput are "connection-based" performance parameters, and delay and delay jitter are "packet-based." To support QoS for applications, operating systems must support the counterparts of these parameters. Although "real-time" operating systems can be used to provide some of the QoS parameters (e.g., delay), they have limited applicability.
This is because schedulers used in real-time operating systems typically require a priori knowledge of applications' resource consumption, periodicity and/or deadlines. Furthermore, real-time operating systems require a different programming interface from that of conventional operating systems.

QoS support in operating systems has the potential to provide users and applications with a predictable computing environment (e.g., delay bounds, throughput) and with control over their computing environments (e.g., selection of a suitable computing environment and provisioning it). Furthermore, this can be achieved without modifying the applications and without a priori knowledge of their behavior.

A major problem in providing QoS guarantees in operating systems is managing the multiple resources that the operating system controls, such as CPU, disks, memory, I/O bus, and network interfaces. The problem is complicated by the fact that (1) each application may require multiple resources during its execution, (2) there may be no a priori knowledge about the resource requirements of applications, and (3) different resources may have different characteristics (e.g., preemptable vs. non-preemptable). Some of these issues are addressed in the Eclipse operating system [bgos:bl97], which we are currently developing at Bell Labs, as well as in [mercer94,ggv:osdi96,ww:mit95]. However, more research is required to provide "true" QoS support in operating systems.

REFERENCES

dks:sigcomm89: A. Demers and S. Keshav and S. Shenker, "Analysis and Simulation of a Fair Queueing Algorithm", Proceedings of SIGCOMM'89.

z:sigcomm90: L. Zhang, "Virtual Clock: A New Traffic Control Algorithm for Packet Switching Networks", Proceedings of SIGCOMM'90.

g:infocom94: J. Golestani, "A Self-Clocked Fair Queueing Scheme for Broadband Applications", Proceedings of INFOCOM'94.

z:proc95: H. Zhang, "Service Disciplines For Guaranteed Performance Service in Packet-Switching Networks", Proceedings of the IEEE, Oct. 1995.

ggv:osdi96: P. Goyal and X. Guo and H. M. Vin, "A Hierarchical CPU Scheduler for Multimedia Operating Systems", Proceedings of OSDI'96.

bgos:bl97: J. Bruno and E. Gabber and B. Ozden and A. Silberschatz, "The Eclipse Operating System: Providing Quality of Service via Reservation Domains", Information Sciences Research Center, Bell Labs, Murray Hill, NJ, BL0112330-970312-08TM.

mercer94: C. Mercer and S. Savage and H. Tokuda, "Processor Capacity Reserves: Operating System Support for Multimedia Applications", Proceedings of IEEE International Conference on Multimedia Computing and Systems, 1994.

ww:mit95: C. A. Waldspurger and W. Weihl, "Stride Scheduling: Deterministic Proportional-share Resource Management", MIT, Laboratory for Computer Science, TM-528.