Project: Information Extraction and Summarization by Agents
Student Researchers: Lori Bryan, Joanne Pinheiro
Advisors: Dr. Cynthia Della Torre Cicalese, Dr. Beryl Hoffman
Institution: Marymount University
PROJECT GOALS
The development of the Internet and the WWW has vastly increased the
amount of information available to users. To deal with this "information
overload", computer scientists have been developing new tools and
algorithms. Software agents, autonomous programs that perform tasks such
as information gathering, are becoming more common on the web. Automatic
summarization, though still in its infancy, has become an important area
of research for helping users cope with this overload. Our CREW
project involved the creation of multiple agents that communicate via
message-passing using XML (Extensible Markup Language) and that are
responsible for different types of information extraction and
summarization on the web.
PROJECT PROCESS
This project combined the research areas of two faculty members and introduced
two students to research in computer science. The members of the project
met weekly to discuss the project. First, we read and discussed research
papers and worked on the software design. The students also had to design
the format of the XML messages that the agents were to use. Then, the
students familiarized themselves with the CoABS Grid (http://coabs.globalinfotek.com),
a method-based application programming interface used to register
agents, advertise their capabilities, discover agents based on their capabilities,
and send messages between agents. Learning to use the grid was challenging
for the students, and they spent a lot of time learning how to build simple
agents as Java classes. During our meetings, we designed a system of simple
agents as the prototype of the system, and we designed a language that
the agents could use to communicate with one another using XML. The students
then built this prototype.
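
The four grid operations described above (register, advertise, discover, send) can be pictured with a small in-memory stand-in. This is only an illustrative sketch and does not use the CoABS Grid API: the class ToyAgentRegistry, the Agent interface, and all method names are hypothetical and were chosen for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory registry; NOT the CoABS Grid API.
public class ToyAgentRegistry {

    // An agent is anything with a name that can receive a message.
    public interface Agent {
        String name();
        void receive(String fromAgent, String xmlMessage);
    }

    private final Map<String, Agent> agentsByName = new HashMap<>();
    private final Map<String, Set<String>> capabilitiesByName = new HashMap<>();

    // Register an agent and advertise its capabilities in one step.
    public void register(Agent agent, String... capabilities) {
        agentsByName.put(agent.name(), agent);
        capabilitiesByName.put(agent.name(), new HashSet<>(Arrays.asList(capabilities)));
    }

    // Discover the names of all agents that advertise a capability.
    public List<String> discover(String capability) {
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : capabilitiesByName.entrySet()) {
            if (entry.getValue().contains(capability)) {
                matches.add(entry.getKey());
            }
        }
        Collections.sort(matches); // stable, predictable order
        return matches;
    }

    // Deliver a message; returns false if the receiver is not registered.
    public boolean send(String from, String to, String xmlMessage) {
        Agent target = agentsByName.get(to);
        if (target == null) {
            return false;
        }
        target.receive(from, xmlMessage);
        return true;
    }

    // Minimal agent that stores every message it receives.
    public static class RecordingAgent implements Agent {
        private final String name;
        public final List<String> inbox = new ArrayList<>();

        public RecordingAgent(String name) {
            this.name = name;
        }

        @Override
        public String name() {
            return name;
        }

        @Override
        public void receive(String fromAgent, String xmlMessage) {
            inbox.add(fromAgent + ": " + xmlMessage);
        }
    }

    public static void main(String[] args) {
        ToyAgentRegistry registry = new ToyAgentRegistry();
        RecordingAgent titleAgent = new RecordingAgent("TitleAgent");
        registry.register(titleAgent, "extract-title");

        // A sender looks up a receiver by capability, then sends to it.
        for (String receiver : registry.discover("extract-title")) {
            registry.send("SearchAgent", receiver, "<message/>");
        }
        System.out.println(titleAgent.inbox);
    }
}
```

In this sketch a sender never needs to know a receiver's name in advance; it asks the registry for any agent advertising the capability it needs, which is the style of discovery the paragraph above describes.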
PROJECT RESULTS
The goal of this project was to develop an agent-based information
extraction and summarization capability. Although our agents do not yet
have summarization capability, our students were able to use the CoABS
Grid to build multiple agents that communicate with one another using
XML.
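
To make the idea of XML message-passing concrete, the following sketch builds and parses one message with the standard Java XML libraries. The actual message format designed by the students is not reproduced in this report, so the element and attribute names used here (message, sender, receiver, task, content) and the class AgentMessage are assumptions made for illustration only.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical message type; the field and tag names are illustrative.
public class AgentMessage {
    public final String sender;
    public final String receiver;
    public final String task;
    public final String content;

    public AgentMessage(String sender, String receiver, String task, String content) {
        this.sender = sender;
        this.receiver = receiver;
        this.task = task;
        this.content = content;
    }

    // Serialize the message as a small XML document.
    public String toXml() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("message");
            root.setAttribute("sender", sender);
            root.setAttribute("receiver", receiver);
            doc.appendChild(root);

            Element taskElement = doc.createElement("task");
            taskElement.setTextContent(task);
            root.appendChild(taskElement);

            Element contentElement = doc.createElement("content");
            contentElement.setTextContent(content);
            root.appendChild(contentElement);

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize message", e);
        }
    }

    // Parse a message produced by toXml().
    public static AgentMessage fromXml(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element root = doc.getDocumentElement();
            String task = root.getElementsByTagName("task").item(0).getTextContent();
            String content = root.getElementsByTagName("content").item(0).getTextContent();
            return new AgentMessage(root.getAttribute("sender"), root.getAttribute("receiver"),
                    task, content);
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a valid agent message", e);
        }
    }

    public static void main(String[] args) {
        AgentMessage request = new AgentMessage(
                "SearchAgent", "TitleAgent", "extract-title", "http://www.example.com/");
        String xml = request.toXml();
        System.out.println(xml);

        AgentMessage received = fromXml(xml);
        System.out.println(received.task + " requested by " + received.sender);
    }
}
```

Building the message through the DOM rather than by string concatenation means characters such as < and & in the content are escaped automatically, so a receiving agent can always parse what a sending agent produced.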
The students designed the format of the XML messages and designed and
implemented the Java classes for the agents. At present the agents perform
simple information extraction, such as reporting the titles of web pages
returned by a search engine. We hope to expand our prototype with
many other agents that have different information extraction and
summarization capabilities.
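
The title-reporting task mentioned above is simple enough to sketch in a few lines of Java. This is a minimal illustration and not the project's code: the class TitleExtractor and its method are hypothetical, the page is supplied as a string rather than fetched from a search engine, and a regular expression stands in for a real HTML parser.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that pulls the <title> text out of an HTML page.
public class TitleExtractor {
    // Case-insensitive so <TITLE> matches; DOTALL so titles may span lines.
    private static final Pattern TITLE = Pattern.compile(
            "<title[^>]*>(.*?)</title>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    public static Optional<String> extractTitle(String html) {
        Matcher matcher = TITLE.matcher(html);
        if (!matcher.find()) {
            return Optional.empty();
        }
        // Collapse line breaks and runs of spaces inside the title.
        String title = matcher.group(1).replaceAll("\\s+", " ").trim();
        return title.isEmpty() ? Optional.empty() : Optional.of(title);
    }

    public static void main(String[] args) {
        String html = "<html><head><TITLE>\n  Marymount University\n</TITLE></head>"
                + "<body>...</body></html>";
        // prints "Marymount University"
        System.out.println(extractTitle(html).orElse("(no title found)"));
    }
}
```

A regular expression is adequate for a single well-formed title element, but real web pages are often malformed, so an agent meant for wider use would more likely rely on a tolerant HTML parser.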
This project gave the students exposure to a number of leading-edge
technologies as well as giving them the opportunity to work on a
cooperative research project in computer science. The students reported
that they learned a lot and enjoyed working on the research project.
STUDENTS AND FACULTY INVOLVED
Our project combined the research areas of two faculty members, Dr.
Cynthia Cicalese and Dr. Beryl Hoffman, and involved two students,
Lori Bryan and Joanne Pinheiro.