Video Computing and 3D Computer Vision

CSc 80000, Section 2

Spring 2005

Prof. Zhigang Zhu
Associate Professor of Computer Science
The City College of New York and Graduate Center
The City University of New York (CUNY)

Time: Tuesday 6:30 - 8:30 pm
Room: 4422
Credits: 3.0

Office Hours: Tuesday 5:00 – 6:00 pm, Rm 4439

Course Description

Computer vision has a rich history of work on stereo and visual motion, addressing 3D reconstruction from binocular or N-ocular images and structure from motion from video sequences. Recently, in addition to these traditional problems, the stereo and motion information present in multiple images or a video sequence is also being used to solve several other problems, for instance video mosaicing, video synthesis, video segmentation, video compression, video registration, and video surveillance and monitoring. These applications are collectively referred to as Video Computing. In them, computer vision plays an important and somewhat different role than it did in the image analysis of early vision research.

The course "Video Computing and 3D Computer Vision" will include advanced topics in video computing as well as fundamentals in stereo and motion. Most of the advanced topics will be discussed in the form of paper reading. Topics include but are not limited to:

Video representation: video mosaicing, layered representations, omnidirectional stereo, Image-Based Rendering (IBR)
Video manipulation: motion segmentation, human tracking, video surveillance
Video compression: content-based video coding (MPEG4/7)
Video interface: Human-Computer Interaction (HCI) using vision

Course Organization

The course will consist of lectures (50%) by the instructor and presentations (50%) by students on their readings and projects (covering the topics listed above). Students who take the course for credit will be required to finish 2-3 assignments consisting of both written work and programming (30%), to submit a term paper or complete a project on a topic related to the material presented in the lectures/readings (50%), and to give at least two presentations to the class, in the middle and at the end of the semester (20%).

Syllabus

Part I. 3D Computer Vision Basics (lectures) - sensors, camera models, camera calibration, stereo vision, visual motion;

Lecture 1. Introduction – Video Computing and 3D Computer Vision (pps) – Feb 1

Lecture 2. Sensors (pps) and Image Formations (pps) - Feb 8 (Homework #1)

Lecture 3. Camera (pps) and Omnidirectional Camera (pps) Models – Feb 15

Lecture 4. Camera Calibration (pps) – Feb 22 (Homework #2)

Lecture 5. Stereo Vision (ppt) – March 1

Lecture 6. Visual Motion (ppt) – March 8 (Homework #3)

Part II. Video Computing (readings and projects) - Please check out the Reading List - NO CLASS on March 15.

Lecture 7. Video Mosaicing - Please go to the GC Computer Science Colloquium - My Talk @ 4:15 pm on March 17

Lecture 8. Omnidirectional Stereo Vision (pps) - March 22

(Please send your first and second choices among the five groups before March 21)

Student Reading Presentations

Each student will give a 30-minute presentation on the papers she/he has selected. Please send me your PPT slides before the class so that I can post them on the web site. Please write "CSc 80000 Section 2 Reading Presentation" in the Subject line so that your email message will be directed to the vision course folder in my mailbox. Don't forget to work on your final projects while you are reading papers!

April 5, 2005 - Motion and Factorization
    Li, Weihong, Layers and Motion
    Gutherc, Miriam C., Factorization for SFM
    Cai, Kai, Factorization for Layer Extraction

April 12, 2005 - Stereo and OmniStereo
    Davidi, Ran, Stereo Mosaics
    Chowdhury, Sadat, Region-based Stereo
    Schultz, Anthony, Cooperative Stereo

April 19, 2005 - Vision for Robotics
    Chakravarthy, Narashiman, SIFT for Robotics
    Feng, Yi, SIFT and Robot Coordination
    Kammet, Joel M., Robot Cooperation

May 3, 2005 - Layers and Mosaics
    Chen, Wei, Layered Representation
    Dubowy, Joel, Graph Cut for Layer Extraction
    Fadaifard, Hadi, SIFT for Mosaicing

Student Project Presentations

Students are encouraged to prepare and work on their projects while doing the paper reading. You could either implement an algorithm proposed in a paper you are reading, or use an idea from a paper to accomplish a task you come up with. Please let me know your project topic as early as possible. Feel free to talk with me about your project ideas and problems in class and during my office hours. Each student is required to turn in a hard-copy project report, which includes a title, an abstract, a brief literature review or background description, a method or algorithm description, experimental design, results, and conclusions and discussion. Please also prepare a mini-presentation (10 min) with a demo (5 min).

May 10, 2005 - 6 students, demos in the classroom - TBD
May 17, 2005 - 6 students, demos in the classroom - TBD

Textbook and References

Textbook:
“Introductory Techniques for 3-D Computer Vision,” Emanuele Trucco and Alessandro Verri, Prentice Hall, Inc., 1998 (ISBN: 0132611082, 343 pages).

References:

“Computer Vision – A Modern Approach,” David A. Forsyth and Jean Ponce, Prentice Hall, 2003 (ISBN: 0130851981, 693 pages).
“Three Dimensional Computer Vision: A Geometric Viewpoint,” Olivier Faugeras, The MIT Press, November 19, 1993 (ISBN: 0262061589, 695 pages).
“Image Processing, Analysis and Machine Vision,” Milan Sonka, Vaclav Hlavac, and Roger Boyle, Prentice Hall, 1999 (ISBN: 053495393X, 800 pages).
“Digital Image Processing: Concepts, Algorithms, and Scientific Applications,” Bernd Jähne, Springer-Verlag, 1991. This book gives a very good description of separable convolution kernels and of how to generate larger 1D/2D kernels from smaller 1D kernels.
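
As a rough illustration of the separable-kernel idea mentioned in the last reference, here is a minimal sketch in plain Python. It is not taken from the book: the function names are hypothetical, the kernels are left unnormalized, and the binomial kernel is just one convenient choice of small 1D kernel. It builds a larger 1D kernel by repeatedly convolving [1, 1] with itself, forms a 2D kernel as an outer product of two 1D kernels, and filters an image with two 1D passes instead of one 2D pass.

```python
def convolve_1d(a, b):
    """Full discrete convolution of two 1D sequences."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def binomial_kernel(n):
    """Length-(n+1) binomial kernel: [1, 1] convolved with itself n times."""
    kernel = [1]
    for _ in range(n):
        kernel = convolve_1d(kernel, [1, 1])
    return kernel


def outer(col, row):
    """2D kernel formed as the outer product of a column and a row kernel."""
    return [[c * r for r in row] for c in col]


def convolve_2d_valid(image, kernel):
    """Direct 2D convolution, keeping only positions where the kernel fits."""
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(image), len(image[0])
    out = []
    for y in range(h - kh + 1):
        out_row = []
        for x in range(w - kw + 1):
            s = 0
            for i in range(kh):
                for j in range(kw):
                    # Index the kernel in reverse: convolution, not correlation.
                    s += image[y + i][x + j] * kernel[kh - 1 - i][kw - 1 - j]
            out_row.append(s)
        out.append(out_row)
    return out


def convolve_separable_valid(image, col, row):
    """Horizontal 1D pass followed by a vertical 1D pass."""
    horizontal = convolve_2d_valid(image, [row])            # 1 x len(row) kernel
    return convolve_2d_valid(horizontal, [[c] for c in col])  # len(col) x 1 kernel


if __name__ == "__main__":
    k3 = binomial_kernel(2)        # small 1D kernel
    k5 = convolve_1d(k3, k3)       # larger 1D kernel from two smaller ones
    image = [[(3 * y + 5 * x) % 7 for x in range(6)] for y in range(5)]
    direct = convolve_2d_valid(image, outer(k3, k3))
    two_pass = convolve_separable_valid(image, k3, k3)
    print(k3, k5, direct == two_pass)
```

For a kernel of size k x k, the two-pass version needs about 2k multiplications per output pixel instead of k*k, which is the usual reason for preferring separable kernels. In practice the kernel would be divided by the sum of its entries so that the filter preserves overall image brightness.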

Supplements: Online References and additional readings when necessary.