PRISM: Parallel Ray Interpolation for Stereo Mosaicing

 



We present a novel method for automatically and efficiently generating stereoscopic mosaics by seamless registration of optical data collected by a video camera mounted on an airborne platform that mainly undergoes translational motion. The resulting mosaics are seamless and exhibit correct three-dimensional (3D) views. The basic idea is to construct stereo mosaics before 3D recovery for applications such as image-based rendering and environmental monitoring. An important part of the work is a new mosaic representation that supports seamless mosaicing under rather general motion and also captures inherent 3D information during the mosaicing process. A parallel-perspective model is selected for representing mosaics in our approach since it is the closest form to the original perspective video sequence under large motion parallax, yet its geometry allows us to generate seamless stereo mosaics. To accomplish this, we propose a novel technique called PRISM (parallel ray interpolation for stereo mosaicing) to efficiently convert a sequence of perspective images with dramatically changing viewpoints into parallel-perspective stereo mosaics.

Example 1: Stereo Mosaics from a Single Video Sequence of a forest scene


Left View Mosaic (full resolution JPEG file, 1.8 MB)
Right View Mosaic (full resolution JPEG file, 1.8 MB)

3D Recovery from Stereo Mosaics
Depth Map (the brighter, the nearer) (full resolution JPEG file, 384 KB)

Stereoscopic Viewing (full resolution JPEG file, 2.8 MB)
In the RGB image, the R channel is the R band of the right mosaic and the G/B channels are the G/B bands of the left mosaic. Wearing a pair of left-blue/right-red glasses, you can perceive a vivid color 3D effect.
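The channel mixing described for the stereoscopic image can be sketched in a few lines. This is only an illustration, assuming both mosaics are already aligned, have the same size, and are stored as H x W x 3 NumPy arrays in RGB order; the function name `make_anaglyph` is ours and is not part of the original system.

```python
import numpy as np

def make_anaglyph(left_rgb: np.ndarray, right_rgb: np.ndarray) -> np.ndarray:
    """Combine two H x W x 3 RGB mosaics into a single color stereo image.

    Assumption: channel order is R, G, B and both mosaics share one shape.
    """
    if left_rgb.shape != right_rgb.shape:
        raise ValueError("left and right mosaics must have the same shape")
    anaglyph = left_rgb.copy()             # G and B bands come from the left mosaic
    anaglyph[..., 0] = right_rgb[..., 0]   # R band comes from the right mosaic
    return anaglyph
```

The inputs are left unmodified because the result is built on a copy of the left mosaic.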

Basic Geometry of the stereo mosaics

Let us first assume that the motion of the camera is a 1D translation, the optical axis is perpendicular to the motion, and the frames are dense enough. Then we can generate two spatio-temporal images by extracting two columns of pixels (perpendicular to the motion) at the front and the rear edges of each frame. The mosaic images thus generated are similar to parallel-perspective images captured by a linear pushbroom camera, which has perspective projection in the direction perpendicular to the motion and parallel projection in the motion direction. In contrast to the common pushbroom aerial image, these mosaics are obtained from two different oblique viewing angles of a single camera's field of view, one set of rays looking forward and the other looking backward, so that a stereo pair of left and right mosaics can be generated as the sensor moves forward.
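The two-column extraction under this idealized motion can be sketched as follows. The sketch assumes the motion runs along the image x axis, that the camera advances about one pixel per frame (the "dense enough" condition), and that frames are NumPy arrays; `slit_mosaics`, `front_col` and `rear_col` are illustrative names, not part of the original system.

```python
import numpy as np

def slit_mosaics(frames, front_col, rear_col):
    """Build two spatio-temporal images from a sequence of frames.

    Each frame is an H x W (or H x W x C) array. Column t of each output
    is the chosen pixel column of frame t, so the horizontal axis of the
    result corresponds to time, i.e. camera position along the track.
    """
    front = np.stack([f[:, front_col] for f in frames], axis=1)
    rear = np.stack([f[:, rear_col] for f in frames], axis=1)
    return front, rear
```

For example, `slit_mosaics(frames, front_col=W - 1, rear_col=0)` takes the two extreme columns of frames of width `W`. Which output serves as the left or the right mosaic depends on the direction of motion.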

Since a fixed angle between the two viewing rays is selected for generating the stereo mosaics, the "disparities" of all points are fixed; what varies instead is the baseline, giving a geometry of optimal/adaptive baselines for all the points. In other words, for any point in the left mosaic, searching for the matching point in the right mosaic means finding the original frame in which the matched pair has the fixed disparity, and hence an adaptive baseline that depends on the depth of the point.
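To make the fixed-disparity, adaptive-baseline relation concrete, here is a small worked sketch under the idealized 1D translation above with a pinhole camera. The symbols are our own notation: F is the focal length in pixels, d is the fixed separation in pixels between the front and rear slits, and B is the distance the camera travels between seeing a point in the front slit and seeing it in the rear slit. Similar triangles then give Z = F * B / d, so a nearer point yields a shorter baseline and a farther point a longer one.

```python
def depth_from_baseline(baseline: float, focal_length_px: float,
                        slit_separation_px: float) -> float:
    """Depth Z = F * B / d for a point seen in both slits.

    baseline           -- camera travel B between the two sightings
    focal_length_px    -- focal length F in pixels (assumed known)
    slit_separation_px -- fixed distance d between the two slits, in pixels
    The result is in the same unit as `baseline`.
    """
    if slit_separation_px <= 0:
        raise ValueError("slit separation must be positive")
    return focal_length_px * baseline / slit_separation_px
```

With F = 500 pixels and d = 100 pixels, a point that needs 10 m of camera travel to cross from one slit to the other lies at a depth of 50 m.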
 

PRISM (parallel ray interpolation for stereo mosaicing) approach

In the PRISM approach for large-scale 3D scene modeling, the computation of "match" is efficiently distributed in three steps: camera pose estimation, image mosaicing and 3D reconstruction. In estimating camera poses (for image rectification), only sparse tie points widely distributed in the two images are needed. In generating stereo mosaics, matches are only performed for ray interpolation between small overlapping regions of successive frames. In using stereo mosaics for 3D recovery, matches are only carried out between the two final mosaics, which is equivalent to finding a matching frame for every point in one of the mosaics with a fixed disparity. Thus stereo mosaics using parallel-perspective projection are a compact and efficient way to represent 3D information of a scene over a large spatial scale under a rather general motion.

Example 2: Stereo Mosaics from a Single Video Sequence of the UMass Campus

The video sequence contains 1000+ color frames of 720*480 pixels. The mosaics below were created from a temporally sub-sampled sequence that keeps every 10th frame.

1.  Stereo Mosaics by using a 2D mosaicing technique, where geometric seams can be observed

Due to large and varying displacements between each pair of successive frames in the image sequence, extracting a one-column slice from each frame is not sufficient to form uniformly dense mosaics. In a "manifold mosaic", each image contributes a slice to the mosaic, and the width of the slice is a function of the displacement between frames. For a translating camera, a manifold mosaic can be modeled as a multi-perspective image: each sub-image (with more than one column) taken from an original frame is fully perspective, but sub-images from different frames have different viewpoints. This may cause geometric seams in the mosaic due to motion parallax under translation.
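The slice-per-frame construction of a manifold mosaic can be sketched as below. This is a simplified illustration rather than the 2D mosaicing code used for the images here: it assumes motion along the image x axis, whole-pixel displacements that are already known, and slices centered on one fixed column. The names `manifold_mosaic`, `displacements` and `center_col` are ours.

```python
import numpy as np

def manifold_mosaic(frames, displacements, center_col):
    """Concatenate one vertical slice per frame.

    The slice taken from each frame is as wide as that frame's
    displacement (rounded to whole pixels, at least one column).
    Each slice keeps its own perspective projection, which is why
    seams can appear where neighboring slices meet.
    """
    slices = []
    for frame, shift in zip(frames, displacements):
        width = max(1, int(round(shift)))
        start = center_col - width // 2
        slices.append(frame[:, start:start + width])
    return np.concatenate(slices, axis=1)
```

The mosaic width is the sum of the rounded displacements, so three frames with displacements 3, 5 and 2 produce a mosaic 10 columns wide.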

 Left mosaic (1 MB JPEG)     Right mosaic (1 MB JPEG)
 
 

2.  High Resolution Stereo Mosaics by using the proposed PRISM technique
How can we generate seamless mosaics in a computationally efficient way? The key to our approach lies in the parallel-perspective representation and a novel PRISM (parallel ray interpolation for stereo mosaicing) approach. For each of the left and right mosaics, we only need to take a front (or rear) slice of a certain width (determined by the interframe motion) from each frame, and perform local registration between the overlapping slices of successive frames. We then directly generate parallel interpolated rays between two known discrete perspective views for the left (or right) mosaic. Our approach is similar to image synthesis by view interpolation, which has been well studied in image-based rendering. Fortunately, in our case we only need to perform a small number of parallel-perspective ray interpolations instead of a complete view interpolation between a pair of successive images. In addition, the distance between two successive views is small, so the synthetic parallel-perspective rays between the two known views are not subject to serious occlusion problems. As a result, each mosaic is a parallel-perspective composite image whose viewpoints lie on a smooth (interpolated) curve.
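The geometric core of one ray interpolation can be sketched for a single matched point. The sketch assumes rectified frames under pure translation along the image y axis and a pinhole camera, and the notation is ours: `y1` and `y2` are the coordinates of the same scene point in two successive frames, `frame_shift` is the camera translation between them, and `slit_offset` is the fixed image coordinate of the slit that defines the mosaic's viewing direction. The point projects to y1 = F*Y/Z in the first frame and y2 = F*(Y - S)/Z in the second, so a virtual camera at position T sees it exactly on the slit when T = S * (y1 - slit_offset) / (y1 - y2). It illustrates the idea under these assumptions and is not the full PRISM implementation.

```python
def interpolated_view_position(y1: float, y2: float,
                               frame_shift: float,
                               slit_offset: float) -> float:
    """Position of the virtual view whose slit ray hits the matched point.

    The result is measured from the first frame's viewpoint, in the same
    unit as `frame_shift`. When `slit_offset` lies between y2 and y1, the
    virtual view lies between the two real views.
    """
    parallax = y1 - y2
    if parallax == 0:
        raise ValueError("zero parallax: the match carries no depth cue")
    return frame_shift * (y1 - slit_offset) / parallax
```

For example, with y1 = 80, y2 = 40, a frame shift of 4 and a slit at 60, the virtual view sits at 2, halfway between the two frames. The pixel would then be placed in the mosaic at the column corresponding to that position.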

Left mosaic (1 MB JPEG)    Right mosaic (1 MB JPEG)
 
 

3. High Resolution Color Stereoscopic Viewing (Stereo mosaics 1.5 MB)

Example 3: Stereo Mosaics from a Single Video Sequence (Bolivia Data Set)

0.  Original video (AVI Movie: 10 MB)
1.  3D recovery from stereo mosaics


 

Left image   (mosaic_left.jpg, 914 KB)            Right image   (mosaic_right.jpg, 947 KB)

Depth map (the brighter, the nearer) (mosaic_depth_map.jpg, 601 KB)
2.  Stereoscopic viewing (R_B mosaics 1.2 MB)

Example 4: High Resolution Multi-View Mosaics

In fact, mosaics with more than two oblique viewing angles can be generated from a single video sequence. This yields a set of "multi-disparity" stereo mosaics, analogous to a multi-baseline stereo vision system. Here is an example of five such mosaics.
Far Left Eye (lview2.jpg, 3.4 MB)

Left Eye (view2.jpg, 3.4 MB)

Central Eye (view0.jpg, 3.5 MB)

Right Eye (view1.jpg, 3.3 MB)

Far Right Eye (lview1.jpg, 3.4 MB)



 


 


Collaborators:

Edward M. Riseman, Professor
Allen R. Hanson, Professor
Howard Schultz, Senior Research Scientist

Frank Stolle,  Ph.D. student
Harpal S. Bassali, graduate student
Chris Holmes, system programmer


Supported by:

National Science Foundation Project (Grant Number EIA-9726401), Automatic Interpretation of High-Altitude Image Data for Eco-System Modeling, $1,800,000, 02/01/98 – 01/31/01, PI (Riseman), Co-PIs (Hanson)