Real-time processing is essential in dynamic and unpredictable environments, such as searching for earthquake victims or for people in a burning building. Visual sensing must rapidly focus attention on important activity in the environment, and any room or corridor should be searched quickly to detect people and fire. We therefore employ a camera with a panoramic lens to detect and track multiple moving objects over a full 360-degree view in real time.
The vision functionalities that have been implemented, or that are to be incorporated (marked by *), are listed below.
Cylindrical Image Unwarping - off-line camera calibration and on-line geometric transformation to remove the circular distortion of the panoramic lens image (see the remapping sketch after this list).
Motion-Based Object Detection - detection from a stationary or moving robot (although ego-motion greatly increases the processing complexity); the background is updated continuously and moving objects are segmented from it (a background-maintenance sketch follows this list).
Multiple Moving Object Tracking - after detection, each moving object is tracked in 2D and 3D; the dynamics of each object are identified and analyzed, with motion, texture, and shape cues used during tracking (a data-association sketch also follows this list).
3D Mapping* - stereo and motion processing used to construct the 3D locations of moving objects, significant events, obstacles to motion, and room boundaries, as required by the goal-oriented task.
Object Identification* - finding people, fire, and room entrances and exits, using appearance-based methods (static color or gray-level patterns) and/or temporal-appearance-based methods (motion patterns obtained from tracking).
Human Detection* - humans can be detected even when stationary, using focus-of-attention from previously observed motion, appearance-based matching, face detection, sound localization, etc.
Graphical User Interface (GUI) - contours and tracks of moving objects superimposed on a cylindrical video mosaic representation, for system testing and as an interface to remote tele-presence visualization. We have implemented a prototype version of the GUI.
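As a concrete illustration of the unwarping step, the following is a minimal sketch of the on-line geometric transformation: it remaps the annular image produced by the panoramic lens onto a cylindrical strip. It is not the system's actual implementation; the center and radii are assumed to come from the off-line calibration, and all names and values are illustrative.

```python
import cv2
import numpy as np

def unwarp_panorama(frame, cx, cy, r_min, r_max, out_w=1024):
    """Remap the annular panoramic image to a cylindrical strip.

    cx, cy       -- image center of the lens (from off-line calibration)
    r_min, r_max -- inner/outer radii of the useful annulus (assumed)
    """
    out_h = int(r_max - r_min)
    # For each output pixel, compute its source pixel in the annulus:
    # columns map to viewing angle, rows map to radius.
    theta = np.linspace(0.0, 2.0 * np.pi, out_w, endpoint=False)
    radius = np.linspace(r_max, r_min, out_h)
    rr, tt = np.meshgrid(radius, theta, indexing="ij")
    map_x = (cx + rr * np.cos(tt)).astype(np.float32)
    map_y = (cy + rr * np.sin(tt)).astype(np.float32)
    return cv2.remap(frame, map_x, map_y, cv2.INTER_LINEAR)
```

Because the maps depend only on the calibration, they can be precomputed once, reducing the per-frame cost to a single lookup pass, which is what makes the transform feasible in real time.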
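The motion-based detection item can likewise be illustrated with a simple adaptive background-subtraction loop. This sketch assumes a stationary robot; the learning rate, threshold, and input file name are placeholders, and the system's actual segmentation is more elaborate.

```python
import cv2
import numpy as np

ALPHA = 0.05   # background learning rate (illustrative value)
THRESH = 25    # foreground threshold on the absolute difference

cap = cv2.VideoCapture("panorama.avi")  # hypothetical input sequence
ok, frame = cap.read()
background = np.float32(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY))

while ok:
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Moving pixels deviate strongly from the background model.
    diff = cv2.absdiff(gray, cv2.convertScaleAbs(background))
    _, fg = cv2.threshold(diff, THRESH, 255, cv2.THRESH_BINARY)
    # Refresh the background only where no motion was found, so gradual
    # illumination changes are absorbed without erasing moving objects.
    cv2.accumulateWeighted(gray, background, ALPHA,
                           mask=cv2.bitwise_not(fg))
    ok, frame = cap.read()
```

The masked update is the key design choice: it lets the model track slow environmental and illumination changes while keeping moving people out of the background.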
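For the tracking item, the data-association step can be sketched as greedy nearest-neighbor matching of detected centroids. The system combines motion, texture, and shape cues, so this distance-only version is a simplification, with assumed names and thresholds.

```python
import numpy as np

class Track:
    """One moving object: an id plus its recent centroid history."""
    def __init__(self, track_id, centroid):
        self.id = track_id
        self.history = [centroid]   # trail used to draw the dynamic track

def associate(tracks, detections, max_dist=40.0):
    """Greedily match each track to its nearest unclaimed detection."""
    unmatched = list(detections)
    for t in tracks:
        if not unmatched:
            break
        dists = [np.linalg.norm(np.subtract(t.history[-1], c))
                 for c in unmatched]
        j = int(np.argmin(dists))
        if dists[j] < max_dist:   # gate: reject implausibly large jumps
            t.history.append(unmatched.pop(j))
    return unmatched              # leftover detections start new tracks
```

Augmenting the distance term with texture and shape cues is what allows two tracks to be kept apart through the overlap and occlusion events shown in Fig. 1(c).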
The four moving objects are shown in the unwarped cylindrical image of Fig. 1(b), a more natural panoramic representation for user interpretation. Each of the four people was completely extracted from the complex background, as depicted by the bounding rectangle, direction, and distance to each object. The system tracks each object through the image sequence, even when two people overlap and occlude each other. The dynamic track of each person over the last 30 frames, represented as small circles and an icon (elliptic head and body), is shown in a different color in Fig. 1(c). The final object image is depicted at the end of the corresponding track. Notice that the humans reversed direction, and that overlap and occlusion were handled successfully (see the blue and green sequences). The system can also detect self-motion, changes in the environment or illumination, and sensor failure, refreshing the background accordingly.