------------------------------------------------------
Cover Page:
------------------------------------------------------

White Paper on Mechanisms for Unresponsive Traffic
for the Workshop on Research Directions for the Next Generation Internet
by Bob Braden, Sally Floyd, and Greg Minshall

--------------
Sally Floyd
Staff Scientist
Lawrence Berkeley National Laboratory
MS 50B-2239
One Cyclotron Road
Berkeley CA 94720
Phone: 510-486-7518
Fax: 510-486-6363
Email: Floyd@ee.lbl.gov

---------------
Bob Braden
USC Information Sciences Institute
4676 Admiralty Way
Marina del Rey, CA 90292
Phone: 310-822-1511
Email: Braden@ISI.EDU

----------------
Greg Minshall
Ipsilon Systems
232 Java Drive
Sunnyvale, CA 94089
Email: Minshall@ipsilon.com

------------------------------------------------------
The White Paper:
------------------------------------------------------

White Paper on Mechanisms for Unresponsive Traffic

Bob Braden, Sally Floyd, and Greg Minshall

This white paper presents a recommendation that originated in discussions in the End-to-End Research Group of the Internet Research Task Force (IRTF), and is discussed in a recent internet draft on "Recommendations on Queue Management and Congestion Avoidance in the Internet" [B97]. The second of the two recommendations in that document is the following:

"It is urgent to begin or continue research, engineering, and measurement efforts contributing to the design of mechanisms to deal with flows that are unresponsive to congestion notification or are responsive but more aggressive than TCP."

Most of the rest of this white paper is excerpted from that internet draft, and discusses the urgency of the need for research in this area. We do not in this white paper discuss the range of possible solutions, except to note that one possible approach is the use of Fair Queueing [Demers90] or similar scheduling algorithms that directly manage the bandwidths used by each flow.
Research is also in progress on alternate mechanisms to detect and restrict the bandwidth of unresponsive or high-bandwidth flows [FF97] at the routers, for routers with Random Early Detection (RED) queue management [FJ93].

The discussion in this paper applies to "best-effort" traffic. The Internet integrated services architecture, which provides a mechanism for protecting individual flows from congestion, introduces its own queue management and scheduling algorithms [Shenker96, Wroclawski96].

o The Role of End-to-End Congestion Control in the Internet

The Internet protocol architecture is based on a connectionless end-to-end packet service using the IP protocol. The advantages of its connectionless design, flexibility and robustness, have been amply demonstrated. However, these advantages are not without cost: careful design is required to provide good service under heavy load. In fact, lack of attention to the dynamics of packet forwarding can result in severe service degradation or "Internet meltdown". This phenomenon was first observed during the early growth phase of the Internet in the mid-1980s [Nagle84], and is technically called "congestion collapse".

The original fix for Internet meltdown was provided by Van Jacobson. Beginning in 1986, Jacobson developed the congestion avoidance mechanisms that are now required in TCP implementations [Jacobson88, HostReq89]. These mechanisms operate in the hosts to cause TCP connections to "back off" during congestion. We say that TCP flows are "responsive" to congestion signals (i.e., dropped packets) from the network. It is primarily these TCP congestion avoidance algorithms that prevent the congestion collapse of today's Internet.

Because TCP "backs off" during congestion, a large number of TCP connections can share a single, congested link in such a way that bandwidth is shared reasonably equitably among similarly situated flows.
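
As a rough illustration of this back-off behavior, the following Python sketch models flows that share one link using additive increase and multiplicative decrease (AIMD). It is a simplification and not the TCP algorithm itself: the function name simulate_aimd, the per-round rate model, the assumption that every flow sees the congestion signal in the same round, and all parameter values are hypothetical choices made here for illustration.

```python
def simulate_aimd(initial_rates, capacity, rounds, increase=1.0, decrease=0.5):
    """Toy model of flows sharing one link with AIMD (illustrative only).

    initial_rates: starting sending rate of each flow (arbitrary units)
    capacity:      link capacity in the same units
    rounds:        number of simulated rounds (think "round-trip times")
    increase:      additive step applied when the link is not congested
    decrease:      multiplicative factor applied when the link is congested
    """
    rates = list(initial_rates)
    for _ in range(rounds):
        if sum(rates) > capacity:
            # Congestion: assume every flow sees a drop and backs off.
            rates = [r * decrease for r in rates]
        else:
            # No congestion: every flow probes for more bandwidth.
            rates = [r + increase for r in rates]
    return rates


# Two flows that start far apart end up with nearly identical rates.
final_rates = simulate_aimd([1.0, 80.0], capacity=100.0, rounds=2000)
print(final_rates)
```

In this model the additive steps leave the gap between the two flows unchanged, while each back-off halves it, so the rates drift toward an equal share. That outcome depends on both flows applying the same rule, which is the point taken up next.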
The equitable sharing of bandwidth among flows depends on the fact that all flows are running basically the same congestion avoidance algorithms, conformant with the current TCP specification [HostReq89, RFC2001].

However, that is not the end of the story. Considerable research has been done on Internet dynamics since 1988, and the Internet has grown. It has become clear that the TCP congestion avoidance mechanisms, while necessary and powerful, are not sufficient to provide good service in all circumstances. In particular, there is a potential for future congestion collapse of the Internet due to flows that are unresponsive, or not sufficiently responsive, to congestion indications. There is no consensus solution in the research community for controlling congestion caused by such aggressive flows, and significant research and engineering will be required before any solution is available. It is imperative that this work be energetically pursued, to ensure the future stability of the Internet.

It is convenient to divide flows into three classes: (1) TCP-compatible flows, (2) unresponsive flows, i.e., flows that do not slow down when congestion occurs, and (3) flows that are responsive but are not TCP-compatible. We use the term "TCP-compatible" for a flow that behaves under congestion like a flow produced by a conformant TCP. A TCP-compatible flow is responsive to congestion notification, and in steady-state it uses no more bandwidth than a conformant TCP running under comparable conditions (drop rate, RTT, MTU, etc.). The last two classes of flows contain more aggressive flows that pose significant threats to Internet performance, as we will now discuss. Open issues about the appropriate granularity of a "flow" are addressed further in [B97].

o Non-Responsive Flows

There is a growing set of UDP-based applications whose congestion avoidance algorithms are inadequate or nonexistent (i.e., the flow does not throttle back upon receipt of congestion notification).
Such UDP applications include streaming applications like packet voice and video, and also multicast bulk data transport [SRM96]. If no action is taken, such unresponsive flows could lead to a new congestion collapse.

In general, all UDP-based streaming applications should incorporate effective congestion avoidance mechanisms. For example, recent research has shown the possibility of incorporating congestion avoidance mechanisms such as Receiver-driven Layered Multicast (RLM) within UDP-based streaming applications such as packet video [McCanne96, Bolot94]. Further research and development on ways to accomplish congestion avoidance for streaming applications will be very important.

However, it will also be important for the network to be able to protect itself against unresponsive flows, and mechanisms to accomplish this must be developed and deployed. Deployment of such a mechanism would provide incentive for every streaming application to become responsive by incorporating its own congestion control.

o Non-TCP-Compatible Transport Protocols

The second threat is posed by transport protocol implementations that are responsive to congestion notification but, either deliberately or through faulty implementations, are not TCP-compatible. Such applications can grab an unfair share of the network bandwidth.

For example, the popularity of the Internet has caused a proliferation in the number of TCP implementations. Some of these may fail to implement the TCP congestion avoidance mechanisms correctly because of poor implementation. Others may deliberately be implemented with congestion avoidance algorithms that are more aggressive in their use of bandwidth than other TCP implementations; this would allow a vendor to claim to have a "faster TCP".
The logical consequence of such implementations would be a spiral of increasingly aggressive TCP implementations, leading back to the point where there is effectively no congestion avoidance and the Internet is chronically congested.

o The Need for Further Research

The projected increase in more aggressive flows of both these classes, as a fraction of total Internet traffic, clearly poses a threat to the future Internet. There is an urgent need for measurements of current conditions and for further research into the various ways of managing such flows. There are many difficult issues in identifying and isolating unresponsive or non-TCP-compatible flows at an acceptable router overhead cost. Finally, there is little measurement or simulation evidence available about the rate at which these threats are likely to be realized, or about the expected benefit of router algorithms for managing such flows.

References:

[Bolot94] Bolot, J.-C., Turletti, T., and Wakeman, I., Scalable Feedback Control for Multicast Video Distribution in the Internet, ACM SIGCOMM '94, Sept. 1994.

[B97] B. Braden, D. Clark, J. Crowcroft, B. Davie, S. Deering, D. Estrin, S. Floyd, V. Jacobson, G. Minshall, C. Partridge, L. Peterson, K. Ramakrishnan, S. Shenker, J. Wroclawski, L. Zhang, Recommendations on Queue Management and Congestion Avoidance in the Internet. Internet draft (work in progress), March 1997.

[Demers90] Demers, A., Keshav, S., and Shenker, S., Analysis and Simulation of a Fair Queueing Algorithm, Internetworking: Research and Experience, Vol. 1, 1990, pp. 3-26.

[FF97] Floyd, S., and Fall, K., Router Mechanisms to Support End-to-End Congestion Control. Technical report, February 1997. URL ftp://ftp.ee.lbl.gov/papers/collapse.ps.

[FJ93] Floyd, S., and Jacobson, V., Random Early Detection gateways for Congestion Avoidance, IEEE/ACM Transactions on Networking, V.1 N.4, August 1993, pp. 397-413. Also available from http://ftp.ee.lbl.gov/floyd/red.html.

[HostReq89] R. Braden, Ed., Requirements for Internet Hosts -- Communication Layers, RFC-1122, October 1989.

[Jacobson88] V. Jacobson, Congestion Avoidance and Control, ACM SIGCOMM '88, August 1988.

[McCanne96] McCanne, S., Jacobson, V., and M. Vetterli, Receiver-driven Layered Multicast, ACM SIGCOMM '96.

[Nagle84] J. Nagle, Congestion Control in IP/TCP, RFC-896, January 1984.

[RFC2001] W. Stevens, TCP Slow Start, Congestion Avoidance, Fast Retransmit, and Fast Recovery Algorithms, RFC 2001, January 1997.

[Shenker96] Shenker, S., Partridge, C., and Guerin, R., Specification of Guaranteed Quality of Service, IETF Integrated Services Working Group, Internet draft (work in progress), August 1996.

[SRM96] Floyd, S., Jacobson, V., McCanne, S., Liu, C., and L. Zhang, A Reliable Multicast Framework for Light-weight Sessions and Application Level Framing. ACM SIGCOMM '96, pp. 342-355.

[Wroclawski96] J. Wroclawski, Specification of the Controlled-Load Network Element Service, IETF Integrated Services Working Group, Internet draft (work in progress), August 1996.