Automatic text summarization has become increasingly established over
the past ten years, paving the way for readily accessible, robust
systems able to synthesize relevant information from one or several
source documents. However, research in summarization so far has been
premised upon certain assumptions that are at once enabling and
limiting. For instance, most systems to date produce
summaries using a generic notion of salience valid for all
readers. This notion goes against what both the psycholinguistic and
computational linguistic communities have long acknowledged:
summarization is a function not only of the input documents but also
of the reader's mental state -- who the reader is, what the reader
already knows before reading the summary, and why the reader wants to
know about the source texts. Yet acquiring a full model of the reader's
mental state is very difficult, if not impossible.
In this talk, I will present a novel approach to text summarization
that produces user-sensitive text summaries. The main research
hypothesis of my work is that even limited knowledge about the reader,
gleaned automatically from readily available sources, bolsters the
quality of summaries. I will introduce the main challenges entailed
in generating user-sensitive summaries and the methods employed by my
summarizer. An evaluation with subjects shows that user-sensitive
summaries help readers access relevant information more efficiently than
generic summaries of the same material.
Biography
Noemie Elhadad is a Ph.D. candidate in the Natural Language Processing
group at Columbia University, under the supervision of Prof. Kathleen
McKeown. Her research interests are text summarization, statistical
text generation, user modeling, and digital libraries.