User-Sensitive Text Summarization

Noemie Elhadad, Columbia University

Abstract

Automatic text summarization has become increasingly established over the past ten years, paving the way for readily accessible, robust systems able to synthesize relevant information from one or several source documents. However, research in summarization so far has been premised upon certain assumptions that are both enabling and limiting. For instance, most systems to date produce summaries using a generic notion of salience, assumed valid for all readers. This notion goes against what both the psycholinguistic and computational linguistic communities have long acknowledged: summarization is a function not only of the input documents but also of the reader's mental state (who the reader is, what the reader already knows before reading the summary, and why the reader wants to know about the source texts). Yet acquiring a full model of the reader's mental state is very complicated, if not impossible. In this talk, I will present a novel approach to text summarization that produces user-sensitive text summaries. The main research hypothesis of my work is that even limited knowledge about the reader, gleaned automatically from readily available sources, improves the quality of summaries. I will introduce the main challenges entailed in generating user-sensitive summaries and the methods employed by my summarizer. An evaluation with subjects shows that user-sensitive summaries help readers access relevant information more efficiently than generic summaries of the same material.

Biography

Noemie Elhadad is a Ph.D. candidate in the Natural Language Processing group at Columbia University, under the supervision of Prof. Kathleen McKeown. Her research interests are text summarization, statistical text generation, user modeling, and digital libraries.