Stochastic Spatio-Temporal Grammars for Images and Video
Jeffrey Mark Siskind
School of Electrical and Computer Engineering
Purdue University
Abstract
Probabilistic Context-Free Grammars (PCFGs) induce distributions over
strings. Strings can be viewed as observations that are maps from indices
to terminals. The domains of such maps are totally ordered and the
terminals are discrete. We extend PCFGs to induce densities over
observations with unordered domains and continuous-valued terminals. We
call our extension Spatial Random Tree Grammars (SRTGs). While SRTGs are
context sensitive, the inside-outside algorithm can be extended to support
exact likelihood calculation, MAP estimates, and ML estimation updates in
polynomial time on SRTGs. We call this extension the center-surround
algorithm. SRTGs extend mixture models by adding hierarchical structure
that can vary across observations. The center-surround algorithm can
recover the structure of observations, learn structure from observations,
and classify observations based on their structure. We have used SRTGs and
the center-surround algorithm to process both static images and dynamic
video. In static images, SRTGs have been trained to distinguish houses
from cars. In dynamic video, SRTGs have been trained to distinguish events
such as entering, exiting, picking up, putting down, sitting down, and
standing up. We demonstrate how the structural priors provided by SRTGs
support these tasks.
Joint work with Charles Bouman, Shawn Brownfield, Bingrui Foo, Mary
Harper, Ilya Pollak, and James Sherman.
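
For readers unfamiliar with the starting point of the abstract, the
following is a minimal sketch of how an ordinary PCFG assigns a probability
to a string, using the inside half of the inside-outside algorithm. It
covers only the classical case (totally ordered indices, discrete
terminals), not SRTGs or the center-surround algorithm. The toy grammar,
the function name inside_probability, and the dictionaries LEXICAL and
BINARY are illustrative assumptions and do not come from the talk.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form (an assumption for
# illustration only):  S -> S S  with probability 0.3,  S -> 'a'  with 0.7.
LEXICAL = {("S", "a"): 0.7}        # (nonterminal, terminal) -> probability
BINARY = {("S", "S", "S"): 0.3}    # (parent, left, right)   -> probability


def inside_probability(tokens, lexical, binary, start="S"):
    """Return P(tokens) under the PCFG by summing over all parse trees."""
    n = len(tokens)
    # inside[(i, j)][A] = probability that A derives tokens[i:j]
    inside = defaultdict(lambda: defaultdict(float))

    # Spans of width 1: apply the lexical rules.
    for i, tok in enumerate(tokens):
        for (parent, word), p in lexical.items():
            if word == tok:
                inside[(i, i + 1)][parent] += p

    # Wider spans: sum over every split point and every binary rule.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for (parent, left, right), p in binary.items():
                    inside[(i, j)][parent] += (
                        p * inside[(i, k)][left] * inside[(k, j)][right]
                    )

    return inside[(0, n)][start]


if __name__ == "__main__":
    print(inside_probability(["a", "a", "a"], LEXICAL, BINARY))
```

The table is indexed by contiguous spans (i, j), which relies on the
indices being totally ordered. That is the assumption the abstract says
SRTGs relax.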
Biography
Jeffrey Mark Siskind received the B.A. degree in computer science from the
Technion, Israel Institute of Technology, in 1979, the S.M. degree in
computer science from MIT in 1989, and the Ph.D. degree in computer
science from MIT in 1992. He did a postdoctoral fellowship at the
University of Pennsylvania Institute for Research in Cognitive Science
from 1992 to 1993. He was an assistant professor at the University of
Toronto Department of Computer Science from 1993 to 1995, a senior
lecturer at the Technion Department of Electrical Engineering in 1996, a
visiting assistant professor at the University of Vermont Department of
Computer Science and Electrical Engineering from 1996 to 1997, and a
research scientist at NEC Research Institute, Inc. from 1997 to 2001. He
joined the Purdue University School of Electrical and Computer Engineering
in 2002, where he is currently an associate professor. His research
interests include machine vision, artificial intelligence, cognitive
science, computational linguistics, child language acquisition, and
programming languages and compilers.