Global Information Networks

Jon Kleinberg

kleinbergJon Kleinberg is on the faculty of the Computer Science Department at Cornell University, where he holds the position of Tisch University Professor. His research focuses on issues at the interface of networks and information, with an emphasis on the social and information networks that underpin the Web and other on-line media. He is a member of the National Academy of Engineering and a fellow of the American Academy of Arts and Sciences. He serves on the Computer and Information Science and Engineering (CISE) Advisory Committee of the National Science Foundation, and the Computer Science and Telecommunications Board (CSTB) of the National Research Council. He is the recipient of a MacArthur Foundation Fellowship, a Packard Foundation Fellowship, and a Sloan Foundation Fellowship, the Nevanlinna Prize from the International Mathematical Union, and the National Academy of Sciences Award for Initiatives in Research. [ More ]

 



Back to Main Page

Watch the Talk (20:11)

Download Slides (768 Kb)

Download (340 Mb)

 

Summary

kleinberggrphOver the past decade, ideas from computer science have offered new ways of thinking about the social networks and social processes that link people together. Rapidly proliferating on-line networks such as e-mail, instant messaging, social networking sites, blogs, and the World Wide Web have increasingly blurred the distinction between social and technological networks. They have provided individuals unprecedented opportunities for maintaining existing social ties, and for establishing new ties that cross traditional geographic, social and cultural boundaries. They allow individuals from various backgrounds, with diverse intentions or qualifications, to participate in public discussion. At the same time, these new media create records of interactions and activities, providing researchers from the social sciences, from computer science, and from other disciplines with an unprecedented wealth of data on human interactions. They provide the opportunity for major advances in the social sciences, using ideas from computation.

As a result, we have seen an increasing confluence of social sciences and computer science. Among the questions falling within the purview of computer science are these: Can we detect patterns of action and interaction from large traces of human behavior? Can we build mathematical models to help explain these patterns? And what do these models tell us about the way systems should be designed for large populations of users?

We can extract patterns from very basic human interactions. For example, we consider the density of locations for which pictures have been uploaded to the popular photo-sharing site Flickr. Not only can we extract a workable map, with main roads, coastlines and a good idea of population densities; we can correctly infer labels such as city names or the names of monuments and other sightseeing destinations. Using methods from computer vision, we can select "characteristic" views for these locations.

As a second example, we can analyze text on blogs and news sites over a period of time to find individual words or phrases that appear particularly frequently. Without any supervision or domain knowledge, we recover a surprisingly accurate timeline of important political developments. The source for this timeline is nothing but a trace of the behavior of large numbers of individuals, in this case, bloggers and news writers; when analyzing their collective behavior, we see the emergence of patterns that teach us about the time-varying interests of society as a whole.

As a third example, we can examine how a computational perspective has led to new inter¬pretations of a classic experiment in social psychology, and in turn has suggested new ideas for the organization of both social networks and computer networks. In the 1960's, Stanley Milgram and colleagues aimed to understand the path lengths in the network of friendships in the United States. Individuals in Nebraska were asked to forward a letter to a friend of Milgram's near Boston. The letter could only be forwarded to one person, and only to a friend of the current holder of the letter. The completed chains have surprisingly small lengths, with a median length of 6 hops. From this experiment, the phrase "6 degrees of separation" entered pop culture. Recently, Leskovec et al. analyzed the graph of Microsoft Instant Messenger contacts, and found similar distributions of distances on a much larger network.

The choice made by Milgram to have individuals forward the letter only to one person at a time raises an interesting set of questions: Even if these short paths exist in social networks, how would individuals with their myopic view of the network be able to find them? What are the clues by which individuals make their choices of to whom to forward the letter? What properties must the friendship network exhibit for such simple choices to be successful?

This type of question is a significant departure from a purely static analysis of social networks. We are not asking "what?" but "how?" - an inherently algorithmic question. An important contribution of computer science is to look at the world in algorithmic terms, and ask algorithmic questions. Indeed, Milgram asked individuals about their decision-making process, and found that the two most important criteria were geographic distance and similarity in professions. That raises the following concrete question: What does it tell us about a social network if myopic forwarding based solely on geographic or social distance succeeds in identifying short chains?

An analysis of simple mathematical models for social networks embedded in physical space suggests a qualitative answer to this question: It takes the right balance of links at different distance scales. Roughly speaking, to enable successful forwarding, each individual should have about the same number of friends at distances between 1-10, 10-100, 100-1000, 1000-10000, and so on. In other words, even though there are many more people at larger distances, each distance scale occupies roughly the same attention for each individual. The forwarding chain looks much like the routing in explicitly designed systems such as the US Postal System: a letter is first forwarded to the right country, then the right city, right block, right building, and finally the correct office. Yet the USPS routing system is carefully designed and maintained at great cost, whereas human friendship patterns have evolved naturally.

The examples illustrate two complementary contributions of computer science to the social sciences: the design of systems at the boundary of social and technological networks, and the modeling of social systems in light of their algorithmic behavior. Both shed new light on fundamental questions in the social sciences, but they also raise many new questions and concerns. Why do social processes produce the outcomes they do? Communication technology itself is changing the way in which people interact - how do the patterns of interaction change as a result of technological innovation? Will geographic location lose its high correlation with friendships in light of online social networks? How can we protect the privacy of individuals? Do we actually want computer systems that can learn deep patterns in our behavior from raw traces? The most fundamental question - and one to which computer science is poised to make significant contributions - is how much understanding of ourselves as human beings we can gain with the help of these new scientific developments.

 


The materials on this webpage, including speakers' slides and videos, are copyright the author(s).
Permission is granted for non-commercial use with credit to the author(s) and the Computing Community Consortium (CCC).