Big-Data Computing Study Group

Sponsored by the CCC, the Big-Data Study Group explores and enables opportunities for research on, and applications of, high-performance, data-intensive computing systems, benefiting application areas ranging from astronomy to machine translation. To begin this effort, two events were held in March 2008.


Leads for this workshop and effort

Randy Bryant (Carnegie Mellon University) and Thomas Kwan (Yahoo! Labs)

CCC council liaison for this workshop and effort

Ed Lazowska (University of Washington)

Hadoop Summit

[3/25/08, Sunnyvale, CA] | Speakers, Slides and Videos

Hadoop is an open-source project developing software that enables data-intensive computing on cluster-based systems. It includes a distributed file system and programming support for Map/Reduce, a data-parallel notation for expressing both element-wise and aggregating operations on collections of data.
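The split between element-wise and aggregating operations can be illustrated with a small word count. This is a minimal single-process sketch in plain Python, not Hadoop's API: the names `map_words`, `reduce_counts`, and `run_map_reduce` are hypothetical, and the in-memory grouping step only stands in for the distributed shuffle that a cluster framework would perform.

```python
from collections import defaultdict


def map_words(line):
    """Element-wise step: emit a (word, 1) pair for each word in one record."""
    for word in line.lower().split():
        yield word, 1


def reduce_counts(word, counts):
    """Aggregating step: combine all values that share one key."""
    return word, sum(counts)


def run_map_reduce(records, mapper, reducer):
    """Apply mapper to every record, group values by key, then reduce each group.

    Assumption: everything fits in memory on one machine. A real framework
    would partition the records and the grouped keys across many nodes.
    """
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in groups.items())


if __name__ == "__main__":
    lines = ["big data big clusters", "data intensive computing"]
    print(run_map_reduce(lines, map_words, reduce_counts))
```

Because the mapper looks at one record at a time and the reducer looks at one key at a time, both steps can in principle run in parallel over partitions of the data, which is what makes the notation a fit for cluster-based systems.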

Data-Intensive Computing Symposium

[3/26/08, Sunnyvale, CA] | Speakers, Slides and Videos

This symposium covered a broad range of topics, with presentations by industry and academic leaders on all aspects of data-intensive computing, including systems, programming, algorithms, data management, and both scientific and information-based applications. 


Bernie Acs (NCSA)

William Arms (Cornell)

Roger Barga (Microsoft)

Jacek Becla (SLAC)

Emery Berger (UMass-Amherst)

Christophe Bisciglia (Google)

Randy Bryant (CMU)

Andrew Chien (Intel)

Andrew Connolly (UWashington)

Jeff Dean (Google)

Christos Faloutsos (CMU)

Ian Foster (Argonne)

Dennis Gannon (Indiana)

Garth Gibson (CMU)

Robert Grossman (UI-Chicago)

Jeff Hammerbacher (Facebook)

Steve Heller (Sun)

Haym Hirsh (NSF/Rutgers)

Anita Jones (Virginia)

Randy Katz (Berkeley)

Jay Kistler (Yahoo!)

Ed Lazowska (UWashington)

Xiaozhou Li (HP Labs)

Qi Lu (Yahoo!)

Steve Meacham (NSF)

Marc Najork (Microsoft)

Dave O'Hallaron (Intel/CMU)

Kunle Olukotun (Stanford)

Savas Parastatidis (Microsoft)

Prabhakar Raghavan (Yahoo!)

Bina Ramamurthy (SUNY Buffalo)

Anne Rogers (Chicago)

Arie Shoshani (Lawrence Berkeley Laboratory)

Raymie Stata (Yahoo!)

Ravi Sundaram (Northeastern)

Paul Thompson (Dartmouth)

Andrew Tomkins (Yahoo!)

Dan Weld (UWashington)

Jeannette Wing (NSF)

Ke-Thia Yao (ISI/USC)

ChengXiang Zhai (UIUC)

Eugene Agichtein (Emory)

Eric Baldeschwieler (Yahoo!)

Chaitan Baru (SDSC)

Sugato Basu (Google)

Fran Berman (SDSC)

Andrei Broder (Yahoo!)

Jamie Callan (CMU)

Charlie Clarke (Waterloo)

Gene Cooperman (Northeastern)

Tina Eliassi-Rad (LLNL)

Usama Fayyad (Yahoo!)

Jim French (NSF)

Phil Gibbons (Intel)

Ian Gorton (Pacific NW National Lab)

Milton Halem (UM-BC)

Jiawei Han (UIUC)

Joe Hellerstein (Berkeley)

Chenyi Hu (Central Arkansas)

Richard Karp (Berkeley)

Yoo-Ah Kim (UConn)

Jon Kleinberg (Cornell)

Michael Lesk (Rutgers)

Xavier Llora (NCSA)

Chris Manning (Stanford)

Jill Mesirov (Broad Institute)

Nicholas Nystrom (Pittsburgh Supercomputing)

Chris Olston (Yahoo!)

Patrick Pantel (Yahoo!)

Beth Plale (Indiana)

Raghu Ramakrishnan (Yahoo!)

Dan Reed (Microsoft)

Mikael Ronstrom (MySQL AB)

Padhraic Smyth (UC Irvine)

Alex Szalay (JHU)

Douglas Thain (Notre Dame)

Cristian Ungureanu (NEC Labs)

Stephan Vogel (CMU)

John Wilkes (HP)

Jay Wylie (HP Labs)

Hongyuan Zha (GeorgiaTech)

Yi Zhang (UC Santa Cruz)


Establishing a Big-Data Computing Study Group - [72 KB PDF]

HP, Intel and Yahoo! Create Global Cloud Computing Research Test Bed
The HP, Intel and Yahoo! Cloud Computing Test Bed provided a globally distributed, Internet-scale testing environment designed to encourage research on the software, data center management and hardware issues associated with cloud computing at a larger scale than ever before. The initiative supported research on cloud applications and services.

Milestone Week in Evolving History of Data-Intensive Computing
By Randal E. Bryant, Carnegie Mellon University, and Thomas T. Kwan, Yahoo! Research
[Published originally in the May 2008 edition of Computing Research News, Vol. 20/No. 3]

NSF-CISE: Data-intensive Computing
Program description at the NSF site.

Progress Report: CCC's Support for Data-Intensive Computing
Status update

Big Data Computing Group Kicks Off
Blog post

Milestone Week in Evolving History of Data-Intensive Scalable Computing [85 KB PDF]



Data-Intensive Scalable Computing in Education (DISC 2008)
July 16 - 18, 2008, University of Washington, Seattle, WA

Cloud Computing and Its Applications 2008 (CCA-08)
October 22-23, 2008, Gleacher Center, Chicago, Illinois