Project: Information Extraction and Summarization by Agents
Student Researchers: Lori Bryan, Joanne Pinheiro
Advisors: Dr. Cynthia Della Torre Cicalese, Dr. Beryl Hoffman
Institution: Marymount University




PROJECT GOALS

The growth of the Internet and the World Wide Web has vastly increased the amount of information available to users. To deal with this "information overload", computer scientists have been developing new tools and algorithms. Software agents, autonomous programs that perform tasks such as information gathering, are becoming more common on the web. Automatic summarization, although still in its infancy, has become an important area of research for helping users cope with information overload. Our CREW project involved the creation of multiple agents that communicate by passing messages in XML (Extensible Markup Language) and that are responsible for different types of information extraction and summarization on the web.

PROJECT PROCESS

This project combined the research areas of two faculty members and introduced two students to research in computer science. The members of the project met weekly to discuss progress. First, we read and discussed research papers and worked on the software design. The students also had to design the format of the XML messages that the agents were to use. Then, the students familiarized themselves with the CoABS Grid (http://coabs.globalinfotek.com), which provides a method-based application programming interface for registering agents, advertising their capabilities, discovering agents based on their capabilities, and sending messages between agents. Learning to use the grid was challenging for the students, and they spent considerable time learning how to build simple agents as Java classes. During our meetings, we designed a system of simple agents to serve as the prototype, along with an XML-based language that the agents could use to communicate with one another. The students then built this prototype.
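
The XML message format itself is not reproduced in this report. As a rough illustration of the idea only, the sketch below shows how one agent might serialize a request to XML and how another might read it back, using only the standard Java XML libraries. The class name AgentMessage, the element and attribute names (message, sender, receiver, action, content), and the agent names are all hypothetical, and no CoABS Grid calls are shown.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical message type; the project's real XML format is not given in this report.
public class AgentMessage {
    public final String sender;
    public final String receiver;
    public final String action;   // what the receiving agent is asked to do
    public final String content;  // payload, e.g. a URL to process

    public AgentMessage(String sender, String receiver, String action, String content) {
        this.sender = sender;
        this.receiver = receiver;
        this.action = action;
        this.content = content;
    }

    // Serialize this message as a small XML document (assumed element names).
    public String toXml() {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.newDocument();
            Element root = doc.createElement("message");
            root.setAttribute("sender", sender);
            root.setAttribute("receiver", receiver);
            doc.appendChild(root);

            Element actionEl = doc.createElement("action");
            actionEl.setTextContent(action);
            root.appendChild(actionEl);

            Element contentEl = doc.createElement("content");
            contentEl.setTextContent(content);
            root.appendChild(contentEl);

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize message", e);
        }
    }

    // Parse a message produced by toXml() back into an AgentMessage.
    public static AgentMessage fromXml(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            Element root = doc.getDocumentElement();
            String action = root.getElementsByTagName("action").item(0).getTextContent();
            String content = root.getElementsByTagName("content").item(0).getTextContent();
            return new AgentMessage(root.getAttribute("sender"),
                                    root.getAttribute("receiver"), action, content);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse message", e);
        }
    }

    public static void main(String[] args) {
        // Agent names and the action keyword are made up for this example.
        AgentMessage request = new AgentMessage(
                "SearchAgent", "TitleAgent", "extract-title", "http://example.com/");
        String xml = request.toXml();
        System.out.println(xml);

        AgentMessage received = AgentMessage.fromXml(xml);
        System.out.println(received.action + " -> " + received.content);
    }
}
```

In a design like this, the XML string would be the body of whatever message the grid delivers between agents; keeping serialization separate from transport lets the message format be tested without a running grid.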

PROJECT RESULTS

The goal of this project was to develop an agent-based information extraction and summarization capability. Although our agents do not yet have summarization capability, our students were able to use the CoABS Grid to build multiple agents that communicate with one another using XML. They designed the format for the XML messages and designed and implemented the Java classes for the agents. The agents currently perform simple information extraction, for example reporting the titles of web pages returned by a search engine. We hope to expand our prototype with many other agents that have different information extraction and summarization capabilities. This project gave the students exposure to a number of leading-edge technologies and the opportunity to work on a cooperative research project in computer science. The students reported that they learned a lot and enjoyed working on the research project.
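
The report does not say how the agents obtain a page's title. One simple way to do this kind of extraction, shown here purely as an assumed sketch, is to search the page's HTML for its title element with a regular expression. The class name TitleExtractor and the sample page are hypothetical; a production agent would more likely use a real HTML parser.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; not the project's actual extraction code.
public class TitleExtractor {
    // Matches <title>...</title> in any letter case, allowing line breaks inside the title.
    private static final Pattern TITLE = Pattern.compile(
            "<title[^>]*>(.*?)</title>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns the page title with whitespace collapsed, or empty if none is found.
    public static Optional<String> extractTitle(String html) {
        Matcher matcher = TITLE.matcher(html);
        if (!matcher.find()) {
            return Optional.empty();
        }
        String title = matcher.group(1).replaceAll("\\s+", " ").trim();
        return title.isEmpty() ? Optional.empty() : Optional.of(title);
    }

    public static void main(String[] args) {
        String page = "<html><head><TITLE>\n  Marymount University </TITLE></head>"
                + "<body></body></html>";
        System.out.println(extractTitle(page).orElse("(no title)"));
    }
}
```

An agent built this way could place the extracted title in the content of its XML reply, and pages without a title are reported explicitly rather than as an empty string.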

STUDENTS AND FACULTY INVOLVED

Our project combined the research areas of two faculty members, Dr. Cynthia Cicalese and Dr. Beryl Hoffman, and involved two students, Lori Bryan and Joanne Pinheiro.