White Paper Submission
Research Directions for the Next Generation Internet

Open Research Problems on Scale and Stability

Steve Deering
Cisco Systems
170 West Tasman Drive
San Jose, CA 95134-1706
415-321-0224
deering@cisco.com

Allison Mankin
USC/ISI
4350 N. Fairfax Drive, Suite 620
Arlington, VA 22203
mankin@isi.edu
Voice: 703-807-0132
Fax: 703-812-3712

Lixia Zhang
UCLA Computer Science Department
4531G Boelter Hall
Los Angeles, CA 90095-1596
lixia@cs.ucla.edu
Voice: 310-825-2695

-------------------------------------------------------

Moving the Internet to gigabit networking on a large scale and making the network safe and secure for mission-critical and commercial use are two of the grand challenges for the Next Generation Internet (NGI). Both issues have been discussed extensively and pursued actively. In this white paper, however, we wish to highlight aspects of them that have received less attention but that we believe are equally critical to the success of the NGI. These are:

1. the scalability of network routing and addressing to support a very large number (in the limit defined by RFC 1726, trillions) of systems connecting through the NGI IP infrastructure.

2. the system stability commensurate with the high value of the network to its users, against challenges from partial failures and disasters.

3. the network management to control this system, which is arguably larger and more distributed than any before it.

SCALABILITY OF ROUTING AND ADDRESSING

The basic building block in the original Internet architecture design is datagram delivery. The network forwards packets as fast as possible, without knowledge of individual applications; (most) end systems make use of the network with no knowledge of the network's internals. This clear separation of functionality between the network and the end systems has enabled the Internet's exponential growth.
However, growth has pushed some other parts of the original design to their limits, namely the IP address space and the IP routing system. Constraints on IP address allocation have become a dominant issue in US interactions with Internet Service Providers. We believe the design and development of IPv6 has provided an adequate response to this crisis, and we encourage deployment of IPv6 addressing soon, lest near-term growth severely limit later growth through over-dependence on network address translators and proxying technologies.

The scaling of the routing system remains a great challenge. Today's routing table has reached 45,000 entries and continues to grow rapidly, with no stopping point in sight. Although CIDR has had a positive effect in slowing table growth, continued aggregation and routing table control under CIDR depend on address leasing and forced client renumbering. The original IP design used, and gained significantly from, stably allocated IP addresses as end-system identifiers. Client resistance to the forced renumbering required for CIDR has renewed attention to the value of address stability. Alternatives are to develop stabilized addressing designs (metro) or to instill usefully scaled cryptographic identifiers into all systems in some scalable fashion.

This research problem is important and remains wide open. There is no sign that the Internet's growth rate is slowing; the challenge in this research area is how to design for the orders-of-magnitude growth in user population, traffic volume, and number of online sites that the NGI will see at scales of 1000x and higher.

STABILITY AND ROBUSTNESS

There have been many publicized large-scale network outages recently, some due to configuration errors and miscoordination (e.g. the AOL outage last summer), some due to system vulnerability (e.g. the TCP SYN attack in the New York area), some due to transient overload (e.g.
the University of Washington's multi-day unreachability after Microsoft's IE 3.0 release), and some due to natural disaster and lack of emergency planning. All in all, today's Internet services seem vulnerable, even in the absence of human-made and nature-made crises. (This is not to say the Internet is vulnerable to collapse.) As more and more mission-critical applications move onto the Internet, lack of robust network connectivity could become lethal.

We consider robust connectivity and stability the baseline QoS requirement. Stable access of an end system to its intended peers and security mechanisms are the irreducible elements of QoS. Paul Baran summarized the basic approach to robust systems 30 years ago in his fundamental work on packet switching: a system with adequate physical redundancy and self-adaptivity. The challenge in this area is how to evolve the Internet towards extensive redundancy and self-configuration.

The advancement of internet multicast technology is key to this area. Under internet multicast semantics, peers in arbitrary groupings can develop extremely survivable communications among themselves. On this we can build larger auto-configuring and redundancy-exploiting systems.

SCALABILITY OF MANAGEMENT AND CONTROL

Increasing scale fundamentally challenges network management. The primary network management tool in use today is SNMP, the Simple Network Management Protocol. SNMP was designed to handle network nodes individually and in isolation. It assumes that the invoking client, whether a human operator or a program, has detailed knowledge of all network elements. Recent advances in network management have taken one step forward from SNMP, relieving some human burden through the use of intelligent network agents that go out and collect network operational information.
This step has not changed the basic design, in which individual nodes are the unit of network management functions; the agent visits and investigates the state of one node at a time. At the scales of the NGI, management tools must enable the network to be self-controlled among collections of its elements. Humans should have powerful views into it, but it should not require node-by-node configuration.

A statement by Chuck Thacker (Digital Research Laboratories) captures our long-term issue here: "the net has grown so large in size that manual management is just beyond human reach. The most we can do is to make rules, and leave the net to configure and manage itself."

Internet multicast is a technology we see as key to this open area, again because of the organizational power it may provide. IPv6 addressing may also prove a rich basis for developing autoconfiguration modes.

SUMMARY

We consider the issues of scalable routing and addressing, secured and stable operation, and scalable management and control as important as the breakthroughs in capacity needed for the NGI. Only with some breakthrough results in these areas will it be possible to accomplish the science, education and social missions that the NGI White Paper has identified. We indicate tools for working on the significant open problems in these areas: IPv6 engineering and research based on IPv6; internet multicast systems research; and large model development.
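
ILLUSTRATION: CIDR AGGREGATION AND RENUMBERING

The dependence of CIDR aggregation on renumbering, noted under SCALABILITY OF ROUTING AND ADDRESSING, can be illustrated with a small sketch. It is not part of the original analysis: the provider block, the customer prefixes, and the function name advertised_routes are hypothetical, and the model assumes, as a simplification, that a provider announces its own block plus any customer prefix that falls outside it. Python's standard ipaddress module does the collapsing.

```python
import ipaddress


def advertised_routes(provider_block, customer_prefixes):
    """Return the smallest list of CIDR prefixes covering the provider's
    block and all of its customers' prefixes (a simplified model)."""
    nets = [ipaddress.ip_network(provider_block)]
    nets += [ipaddress.ip_network(p) for p in customer_prefixes]
    # collapse_addresses merges adjacent prefixes and drops any prefix
    # already contained in a larger one.
    return [str(n) for n in ipaddress.collapse_addresses(nets)]


# Hypothetical provider block (private address space, for illustration only).
PROVIDER_BLOCK = "10.1.0.0/22"

# Case 1: a newly arrived customer renumbers into the provider's block.
renumbered = ["10.1.0.0/24", "10.1.1.0/24", "10.1.3.0/24"]

# Case 2: the same customer keeps the prefix it had at its old provider.
kept_old_prefix = ["10.1.0.0/24", "10.1.1.0/24", "10.9.7.0/24"]

print(advertised_routes(PROVIDER_BLOCK, renumbered))
# ['10.1.0.0/22']
print(advertised_routes(PROVIDER_BLOCK, kept_old_prefix))
# ['10.1.0.0/22', '10.9.7.0/24']
```

Under these assumptions, every customer that keeps an address outside its provider's block adds one entry to the routing table, which is the pressure behind forced renumbering.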