POST: Visual Navigation by Integrating Panoramic Vision, Omnidirectional Vision and STereo Vision




Visual navigation of a mobile robot in a natural environment has long been an interesting but challenging problem. It involves almost every aspect of computer vision research, from visual sensors through robust algorithms to visual representations. The basic requirements of visual navigation are global localization (deciding where to go), road following (staying on the road) and obstacle detection (avoiding collisions). Only after these safety requirements have been satisfied, which has proven to be nontrivial, can the robot pursue other task-oriented goals. It is clear that visual environment modeling is the foundation of these basic issues in visual navigation, and it may extend to most real-world problems in computer vision.

Fig. 1. POST: Panoramic, Omnidirectional and Stereo Vision System for Robot Navigation

In Dr. Zhu's Ph.D. thesis (Tsinghua University, 1997), a purposive, multi-scale and full-view visual scene modeling approach is proposed for visual navigation in a natural environment. As a typical instance, an integrated system, POST, is proposed that combines three novel modules (Fig. 1): Panoramic vision for landmark recognition, Omnidirectional vision for road understanding, and STereo vision for obstacle detection. This approach tries to overcome the drawbacks of traditional visual navigation methods, which have mostly depended on local and/or single-view visual information. However, the proposed approach is not a simple combination of the three novel sensors and methods, but rather a systematic integration under the strategy of purposive vision ("the right way for the right work") and under the philosophy of a systems approach, which emphasizes that "the whole is more than the sum of its parts". Thus, correct sensor design, adequate levels of scene representation, and correspondingly robust and fast algorithms are explored specifically for each given task, while the interconnections among the vision sub-systems are taken into consideration under the overall goal of autonomous navigation. The POST approach consists of the following three closely cooperating but relatively independent modules:


  • Panoramic Vision for Landmark Selection
  • A two-stage method is presented for 3D panoramic scene modeling for landmark selection. As input, image sequences are captured by a video camera subject to small but unpredictable fluctuations on a common road surface. First, a 3D image stabilization method is proposed that separates the fluctuations from the vehicle's smooth motion so that "seamless" panoramic view images (PVIs) and epipolar plane images (EPIs) can be generated. Second, an efficient panoramic EPI analysis method is proposed to combine the advantages of both PVIs and EPIs in two important steps: locus orientation detection in the frequency domain, and motion boundary localization in the spatio-temporal domain. The two-stage method not only combines Zheng-Tsuji's PVI method with Bolles-Baker's EPI analysis, resulting in the so-called panoramic EPI method, but also generalizes them to handle image sequences subject to small but unpredictable camera fluctuations. Since camera calibration, image segmentation, feature extraction and matching are completely avoided, all the proposed algorithms are fully automatic and rather general. Finally, a compact representation in the form of a 3D panorama of a large-scale scene is constructed that can be used effectively for generalized landmark selection in robot navigation (Fig. 2). This method can further be applied to image-based rendering.



    (1) panoramic texture map

    (2) panoramic depth map

    (3) parallel projection of the 3D panorama
    Fig. 2.  3D Panoramic representation for landmark selection
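    The idea behind a PVI can be illustrated with a minimal sketch (the `build_pvi` helper below is hypothetical, not the thesis code): a smoothly translating camera contributes one vertical slit per frame, and the stacked slits form a panorama. Fixing an image row instead of a column would yield an EPI.

    ```python
    import numpy as np

    def build_pvi(frames, slit_col=None):
        """Build a panoramic view image (PVI) by stacking one vertical
        slit (image column) taken from each frame of the sequence."""
        frames = np.asarray(frames)      # shape: (T, H, W)
        t, h, w = frames.shape
        if slit_col is None:
            slit_col = w // 2            # use the central column
        # Each frame contributes one column; columns accumulate left to right.
        return frames[:, :, slit_col].T  # shape: (H, T)

    # Tiny synthetic sequence: a vertical stripe drifting one pixel per frame.
    T, H, W = 8, 4, 8
    seq = np.zeros((T, H, W))
    for f in range(T):
        seq[f, :, f % W] = 1.0

    pvi = build_pvi(seq)                 # the stripe crosses the slit once
    epi = seq[:, H // 2, :]              # an EPI: one row over time, shape (T, W)
    ```

    In the real system the slices are resampled after 3D stabilization so that the PVI is "seamless"; in an EPI, a scene point at constant depth traces a straight locus whose slope encodes that depth, which is what the frequency-domain orientation detection exploits.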


     


  •  Omnidirectional Vision for Road Following
  • A new road following approach, the Road Omni-View Image Neural Networks (ROVINN), has been proposed (Fig. 3). It combines omnidirectional image sensing with neural networks so that the robot learns recognition and steering knowledge from omnidirectional road images, which in turn helps ensure that the robot never loses the road. The ROVINN approach brings Yagi's COPIS (conic omnidirectional projection image sensor) method to outdoor road scenes and provides an alternative to CMU's ALVINN system. Compact and rotation-invariant image features are extracted by integrating an omnidirectional eigenspace representation with frequency analysis, using principal component analysis (PCA) and the discrete Fourier transform (DFT). ROVINN's modular neural networks estimate road orientation more robustly and efficiently by first classifying the road type, which enables the robot to adapt to various road types automatically.


    Fig. 3. ROVINN architecture and its interconnection with the other two modules (RCN: Road Classification Network; RON: Road Orientation Network; DFM: Data Fusion Module; IPM: Image Processing Module; DFT: Discrete Fourier Transform; PCA: Principal Component Analysis)
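    The rotation invariance of the DFT-based features can be sketched as follows (the helper names are hypothetical, and a circular ring sample stands in for the full omnidirectional representation): a rotation of the robot circularly shifts the 1D ring signal, and a circular shift changes only the phase of each Fourier coefficient, never its magnitude.

    ```python
    import numpy as np

    def ring_signature(omni_img, center, radius, n_samples=64):
        """Sample intensities on a circle of an omnidirectional image.
        A rotation of the robot circularly shifts this 1D signal."""
        cy, cx = center
        angles = np.linspace(0, 2 * np.pi, n_samples, endpoint=False)
        ys = np.clip((cy + radius * np.sin(angles)).astype(int), 0, omni_img.shape[0] - 1)
        xs = np.clip((cx + radius * np.cos(angles)).astype(int), 0, omni_img.shape[1] - 1)
        return omni_img[ys, xs]

    def rotation_invariant_feature(signal):
        """|DFT| is invariant to circular shifts: a shift multiplies each
        Fourier coefficient by a unit-magnitude phase factor only."""
        return np.abs(np.fft.fft(signal))

    sig = np.random.default_rng(0).random(64)      # a ring signal
    f1 = rotation_invariant_feature(sig)
    f2 = rotation_invariant_feature(np.roll(sig, 17))  # "rotated" robot
    ```

    In ROVINN such magnitude spectra would then be projected by PCA into a compact eigenspace before being fed to the classification and orientation networks.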


     


  • Single-Camera Stereo Vision for Obstacle Detection
  • A novel method called Image Gaze Transformation is presented for stereo-vision-based road obstacle detection. Obstacle detection is modeled as a reflexive behavior: detecting anything that differs from a planar road surface. Dynamic gaze transformation algorithms are developed so that the method also works on rough road surfaces. The novelty of the (dynamic) gaze transformation method, which resembles gaze control in human vision, lies in the fact that it brings the road surface to zero disparity, so the feature extraction and matching procedures of traditional stereo vision are completely avoided in the proposed obstacle detection algorithms (Fig. 4). The progressive processing strategy, from yes/no verification, through focus of attention, to 3D measurement based on the reprojection transformation, makes the hierarchical obstacle detection techniques efficient, fast and robust.



    Fig. 4. Image gaze transformation and obstacle detection. Top: Left and right view in a single camera image; Bottom-left: rectified left image by gaze transformation; Bottom-right: obstacle region after zero-disparity gaze control. The difference image shows that the ground images have been registered.
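    The zero-disparity idea can be sketched with a toy example (the `warp_homography` helper is hypothetical and the plane-induced warp is simplified to a pure horizontal shift): a planar ground relates the two views by a homography, so warping one view by it registers the ground, and anything off the plane survives in the difference image.

    ```python
    import numpy as np

    def warp_homography(img, H):
        """Inverse-warp a 2D grayscale image by a 3x3 homography H
        (nearest-neighbour sampling; out-of-range pixels become 0)."""
        h, w = img.shape
        ys, xs = np.mgrid[0:h, 0:w]
        pts = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
        src = np.linalg.inv(H) @ pts                 # back-project each pixel
        sx = np.round(src[0] / src[2]).astype(int)
        sy = np.round(src[1] / src[2]).astype(int)
        out = np.zeros_like(img)
        valid = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)
        out.ravel()[valid] = img[sy[valid], sx[valid]]
        return out

    # Toy scene: the ground plane induces a pure horizontal shift (disparity d).
    d = 3
    H = np.array([[1.0, 0, d], [0, 1.0, 0], [0, 0, 1.0]])
    left = np.zeros((16, 16)); left[:, 5] = 1.0      # ground texture stripe
    right = warp_homography(left, H)                 # simulated right view
    right[2, 10] = 1.0                               # an off-plane "obstacle"

    # Gaze transformation: bring the ground plane to zero disparity.
    registered = warp_homography(right, np.linalg.inv(H))
    diff = np.abs(left - registered)                 # ~0 on the ground plane
    ```

    The ground stripe cancels in `diff` while the obstacle pixel remains, which is the yes/no verification step; the surviving region then drives focus of attention and, finally, 3D measurement via the reprojection transformation.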


    Related Publications

    [Books and Journal papers]

  •  Z. Zhu, Full View Spatio-Temporal Visual Navigation - Imaging, Modeling and Representation of Real Scenes, China Higher Education Press, December 2001, First Hundred National Excellent Doctorate Dissertations Series.
  • Zhigang Zhu, Shiqiang Yang, Guangyou Xu, Xueyin Lin, Dingji Shi, "Fast road classification and orientation estimation using omni-view images and neural networks," IEEE Transactions on Image Processing, Vol. 7, No. 8, August 1998, pp. 1182-1197.
  • Zhigang Zhu, Guangyou Xu, "Neural networks for omni-view road image understanding," Journal of Computer Science and Technology, vol. 11, no 6, November 1996, Allerton Press Inc, pp. 542-550.
  • Zhigang Zhu, Xueyin Lin, Guangyou Xu, Real-Time Visual Obstacle Detection Based on Reprojection Transformation, Journal of Automation, to appear (in Chinese)
  • Zhigang Zhu, Xueyin Lin, Dingji Shi, et al, A real-time visual obstacle detection system based on reprojection transformation, Computer Research and Development, to appear (in Chinese)
  • Xueyin Lin, Zhigang Zhu, Wen Deng, "A Stereo Matching Algorithm Based on Shape Similarity for Indoor Environment Modeling," Journal of Computers, Vol.20, No.7, July 1997, pp. 654-660 (in Chinese).
  • Zhigang Zhu, Guangyou Xu, Xueyin Lin, Dingji Shi, Multi-scale and full-view visual navigation, Robot, 1998, pp 266-272 (in Chinese)
  • Zhigang Zhu, Guangyou Xu, Xueyin Lin, Dingji Shi, "Spatio-temporal multiple-scale functional vision of an intelligent mobile robot," Computer Research and Development, vol. 34, Suppl., Oct. 1997, pp 48-53 (in Chinese).
  • Guangyou Xu, Zhigang Zhu, Xueyin Lin, Dingji Shi, "Multiple special-designed visual sensing techniques and systems for the comprehensive understanding of outdoor road environment," High Technology Letters, Vol.7, No.8, August 1997, pp.9-13 (in Chinese).
  • Zhigang Zhu, Guangyou Xu, Xueyin Lin, Dingji Shi, "Comprehensive understanding of multiple scale, omni-directional spatio-temporal images," Journal of Tsinghua University, vol. 37, no 3, pp.12-15, 1997 (in Chinese).
  • Zhigang Zhu, Guangyou Xu, Haojun Xi, "Visual navigation by using rotation-invariance images and neural networks," Journal of Image and Graphics, vol. 1, no 5/6, pp. 367-375, October 1996 (in Chinese).
  • Zhigang Zhu, Xueyin Lin, Guangyou Xu, "Multi-distance directional fields method for qualitative visual navigation," Robot, vol. 16, no. sup.1 , pp. 133-138, August 1994 (in Chinese).
  • Zhigang Zhu, Xueyin Lin, "Real-time visual obstacle detection by using reprojection transformation," Journal of Tsinghua University, vol. 33, no s1, pp. 155-162, March 1993 (in Chinese).
  • [Conference papers]
  • Zhigang Zhu, Xueyin Lin, Dingji Shi, Guangyou Xu, "A Single Camera Stereo System for Obstacle Detection," World Multiconference on Systemics, Cybernetics and Informatics (SCI'98) / The 4th International Conference on Information Systems Analysis and Synthesis (ISAS'98), July 12-16, 1998, Orlando, U.S.A., vol. 3, pp. 230-237.
  • Zhigang Zhu, Shiqiang Yang, Dingji Shi, Guangyou Xu, "Better road following by integrating omni-view images and neural nets," 1998 World Congress on Computational Intelligence (WCCI-98), Proceedings of IJCNN, Part 2 (of 3), vol 2, May 4-9, 1998, Anchorage, Alaska, USA, pp 974-979.
  • Zhigang Zhu, Haojun Xi, Guangyou Xu, "Combining rotation-invariance images and neural networks for road scene understanding," Proc. IEEE International Conference on Neural Networks, pp. 1732-1737, 1996.
  • Xueyin Lin, Zhigang Zhu, Wen Deng, "A Stereo Matching Algorithm Based on Shape Similarity for Indoor Environment Model Building," Proc. IEEE International Conference Robotics and Automation, 1996, pp 765-770.
  • Zhigang Zhu, Guangyou Xu, Shaoyun Chen, Xueyin Lin, "Dynamic obstacle detection through cooperation of purposive visual modules of color stereo and motion", Proc. IEEE International Conference on Robotics and Automation, San Diego, May 1994, pp 1916-1922.
  • Zhigang Zhu, Guangyou Xu, Jian Peng, Dingji Shi, "Route understanding using spatio-temporal images viewed through a cross window", Proc. Asian Conference on Computer Vision, Osaka, Japan, 1993, pp. 35-38.
  • Xueyin Lin, Shaoyun Chen, Zhigang Zhu, "Visual navigation by detecting frame difference in image sequence", Proc. SPIE Mobile Robot VIII, vol 2058, Boston, MA,1993, pp 105-115
  • Shaoyun Chen, Xueyin Lin, Zhigang Zhu "Qualitative visual navigation using weighted correlation", Proc IEEE Conference on Computer Vision and Pattern Recognition, New York, USA, 1993, pp 620-624.
  • Zhigang Zhu, Guangyou Xu, Shaoyun Chen, Xueyin Lin, "Dynamic obstacle detection by purposively integrating binocular color image sequence", Proc International Conference on Neural Networks and Signal Processing, Guangzhou, 1993.
  • Xueyin Lin, Zhigang Zhu, Wenxia Yu, " Dynamic environment understanding for road following", Proc International Conference on Intelligent Information Processing Systems, Beijing, 1992
  • Xueyin Lin, Zhigang Zhu, "Detecting height from constrained motion", Proc. IEEE 3rd International Conference on Computer Vision, Osaka, Japan, 1990, pp. 503-506
  • Zhigang Zhu, Xueyin Lin, " Real-time algorithms for obstacle avoidance by using reprojection transformation", Proc. IAPR Workshop on Machine Vision and Applications, Tokyo, Japan, 1990, pp. 393-396.
  • Xueyin Lin, Zhigang Zhu, Wenxia Yu, " 2D shape analysis technique for road images," Proc 6th National Image and Graphics Symposium, Beijing, China, April 1992 (Best conference paper) (in Chinese).
  • Zhigang Zhu, Xueyin Lin, "Real-time obstacle detection based on reprojection transformation," The 2nd Computer Vision and Intelligent Control Symposium of China Artificial Intelligence Federation, Wuhan, China, October 1991 (Best conference paper) (in Chinese).
  • Zhigang Zhu, Xueyin Lin, "Reprojection transformation and applications in visual tasks of mobile robots," Proc 5th National Image and Graphics Symposium, Beijing, China, April 1990 (in Chinese)
  • Zhigang Zhu, Xueyin Lin, "Visual navigation and environment modeling for a mobile robot - height from motion," Proc 3rd National Robotics Symposium, Guilin , China, June 1990 (in Chinese)

  • Collaborators:

    Guangyou Xu, Professor, Department of Computer Science and Technology, Tsinghua University, Beijing
    Xueyin Lin, Professor, Department of Computer Science and Technology, Tsinghua University, Beijing
    Dingji Shi, Professor, Department of Computer Science and Technology, Tsinghua University, Beijing
    Shiqiang Yang, Professor, Department of Computer Science and Technology, Tsinghua University, Beijing
    Students:
    Bo Yang, Ph.D. student,  Department of Computer Science and Technology, Tsinghua University, Beijing
    Shaoyun Chen, MS student,  Department of Computer Science and Technology, Tsinghua University, Beijing
    Wenxia Yu, MS student,  Department of Computer Science and Technology, Tsinghua University, Beijing
    Haojun Xi, MS  student,  Department of Computer Science and Technology, Tsinghua University, Beijing
    Jian Peng, MS  student,  Department of Computer Science and Technology, Tsinghua University, Beijing
    Wen Deng, MS  student,  Department of Computer Science and Technology, Tsinghua University, Beijing


    Supported by
  • China National Advanced Research Program, Image Mosaic-based Visual Scene Modeling and Target Recognition, 1/1996-12/2000, Co-PI (Xu, Zhu)
  • China National Science Foundation, Image Stabilization Technique for Mobile Robot,   06/1995- 09/1997, PI (Zhu)
  • China National Science Foundation, Motion Estimation Using Gradients of Image Intensity, 1/1994-12/1996, Co-PI (Zhu)
  • China National Advanced Research Program, Comprehensive Understanding of Multiple Sensing Systems, 1/1991-12/1995, Co-PIs (Xu, Zhu)

