CSc Senior Capstone Sequence 2004-2005
Computer Science - The City College of New York
 
Vision, Video and Virtual Reality

Instructor: Professor Zhigang Zhu



Augmented New York City - Vision, Video and Virtual Reality in Traffic and Surveillance

Widely distributed sensor networks are becoming commonplace in our environments. Web-available cameras allow anyone with an Internet connection to peek in real time into traffic, national parks, sporting events, and office spaces; security guards have access to tens or hundreds of cameras; and robot explorers carry cameras and other sensors into dangerous or out-of-reach areas. As a concrete example in one of the world’s largest cities, the New York City DOT’s Traffic Management Center maintains 86 closed-circuit TV cameras on major arteries of NYC to help disentangle traffic jams. In this Intelligent Transportation System (ITS), the cameras and signals are controlled by operations staff, who track and manage the system through interfaces of projection graphics and arrays of video monitors. DOT’s Real-Time Traffic Cameras provide online (http://nyctmc.org) both streaming video and frequently updated still images from DOT camera locations for traffic advisories. A user (driver) can click the location of a camera on a top-view 2D map to activate its video stream or still images. In the City Drive Live program, twenty-two of DOT’s traffic cameras, showing live traffic conditions at major locations, can be watched on Crosswalks Channel 74, the television network of the City of New York, on weekday mornings from 6:30 to 9:00 am. The program first shows a 2D map with the camera locations marked, and then displays each video stream for a while, in a fixed order.

However, the individual sensors (cameras) generally provide constrained and separate viewpoints from which users experience the spatially disparate information. This limits the ability of users to immerse themselves in the experience provided by the space, to construct coherent models of the spatial geometry, and (in appropriate circumstances) to make real-time tactical decisions. We are interested in an augmented interface for real-time traffic cameras, where the video streams of the real traffic cameras are geo-registered with a 2D map or even a virtual 3D model of the streets and buildings of the city. A user could then take a virtual walk-through of the augmented, immersive environment, viewing the real-time traffic from the sensor network.

Requirements: In this project you are expected to register a subset of the 22 NYC DOT video streams on a NYC map and to implement a real-time virtual walkthrough. To do that, you need to (1) register each camera view with the 2D map using a planar transformation (a C++ sketch follows this paragraph); and (2) let the system change viewpoints so that a user can virtually walk or fly through the map and view the real video at the correct viewing angles.
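
To make step (1) concrete, below is a minimal C++ sketch of applying a planar transformation: a 3x3 homography H maps a pixel (u, v) in a video frame to a point (x, y) on the 2D map. The matrix values here are placeholders; in the project, H would be estimated from at least four point correspondences between a camera frame and the map.

    // Minimal sketch: applying a planar transformation (3x3 homography)
    // that maps video-frame pixels to 2D map coordinates. The matrix H
    // below is a placeholder; it would be estimated from at least four
    // point correspondences between a camera frame and the NYC map.
    #include <cstdio>

    struct Point2 { double x, y; };

    // H is stored row-major: [h0 h1 h2; h3 h4 h5; h6 h7 h8].
    Point2 applyHomography(const double H[9], double u, double v)
    {
        double x = H[0]*u + H[1]*v + H[2];
        double y = H[3]*u + H[4]*v + H[5];
        double w = H[6]*u + H[7]*v + H[8];
        return Point2{ x / w, y / w };   // divide out the homogeneous coordinate
    }

    int main()
    {
        // Placeholder homography (identity plus a translation), for illustration only.
        const double H[9] = { 1, 0, 100,
                              0, 1,  50,
                              0, 0,   1 };
        Point2 p = applyHomography(H, 360.0, 240.0);  // center of a 720x480 frame
        std::printf("map coordinates: (%.1f, %.1f)\n", p.x, p.y);
        return 0;
    }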

Tools: A 3D rendering system (OpenGL, Java 3D or other rendering tools).
Input: several traffic video sequences (for example, sequence 1) and a 2D map

New video sequences in AVI format (each 5 MB to 16 MB, approximately 30 seconds to 1.5 minutes, 720x480 color images); a frame-grabbing sketch follows the list of locations:

Location 1 (time 1, time 2)
Location 2 (time 1, time 2)
Location 3 (time 1, time 2)
Location 4 (time 1, time 2)
Location 5 (time 1, time 2)
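
As a starting point for handling these inputs, here is a minimal C++ sketch that grabs frames from one of the AVI files using OpenCV (an assumed choice of decoding library; the file name traffic_loc1_t1.avi is a placeholder). Each decoded frame would then be warped onto the map with the homography, or uploaded as a texture for rendering.

    // Minimal sketch: decoding frames from one of the AVI sequences with
    // OpenCV (one possible library; the file name is a placeholder).
    #include <opencv2/opencv.hpp>
    #include <iostream>

    int main()
    {
        cv::VideoCapture cap("traffic_loc1_t1.avi");  // placeholder file name
        if (!cap.isOpened()) {
            std::cerr << "cannot open video\n";
            return 1;
        }
        cv::Mat frame;
        int count = 0;
        while (cap.read(frame)) {
            ++count;
            // ... warp the 720x480 frame onto the map, or upload it as a texture ...
        }
        std::cout << "decoded " << count << " frames\n";
        return 0;
    }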

Output: a system that can display the augmented NYC map

(1) Study rendering tools (with C++)
(2) Change viewpoints of a 2D map (Matlab; see the OpenGL sketch after this list)
(3) Align video on the map (Matlab)
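
For step (2), below is a minimal C++ sketch (using OpenGL and GLUT rather than Matlab) of the viewpoint change: the 2D map is drawn as a textured quad in the ground plane, and gluLookAt places a virtual camera that can walk or fly over the map. Loading the map image into mapTexture is omitted, and in the full system the eye position would be driven by keyboard or mouse input.

    // Minimal sketch: fly a virtual camera over the 2D map, drawn as a
    // textured quad in the z = 0 plane. Filling mapTexture with the map
    // image is omitted here.
    #include <GL/glut.h>

    GLuint mapTexture = 0;                          // assumed filled by an image loader
    float eyeX = 0.0f, eyeY = -3.0f, eyeZ = 2.0f;   // virtual camera position

    void display()
    {
        glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

        glMatrixMode(GL_PROJECTION);
        glLoadIdentity();
        gluPerspective(45.0, 4.0 / 3.0, 0.1, 100.0);

        glMatrixMode(GL_MODELVIEW);
        glLoadIdentity();
        // Look from the current eye position toward the center of the map.
        gluLookAt(eyeX, eyeY, eyeZ,  0.0, 0.0, 0.0,  0.0, 0.0, 1.0);

        // The map as a textured unit quad in the ground plane.
        glEnable(GL_TEXTURE_2D);
        glBindTexture(GL_TEXTURE_2D, mapTexture);
        glBegin(GL_QUADS);
            glTexCoord2f(0, 0); glVertex3f(-1, -1, 0);
            glTexCoord2f(1, 0); glVertex3f( 1, -1, 0);
            glTexCoord2f(1, 1); glVertex3f( 1,  1, 0);
            glTexCoord2f(0, 1); glVertex3f(-1,  1, 0);
        glEnd();
        glDisable(GL_TEXTURE_2D);

        glutSwapBuffers();
    }

    int main(int argc, char** argv)
    {
        glutInit(&argc, argv);
        glutInitDisplayMode(GLUT_DOUBLE | GLUT_RGB | GLUT_DEPTH);
        glutInitWindowSize(800, 600);
        glutCreateWindow("Augmented NYC map walkthrough");
        glEnable(GL_DEPTH_TEST);
        glutDisplayFunc(display);
        glutMainLoop();
        return 0;
    }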


Reading:
OpenGL, Java 3D, planar transformation (lecture slides).


Copyright © Zhigang Zhu (email: zhu@cs.ccny.cuny.edu), 2004.