List of Publications
The corresponding presentations for some of the papers can be found here.
Reconstructing Occluded Surfaces using Synthetic Apertures: Stereo, Focus and Robust Measures
Vaibhav Vaish, Richard Szeliski, Larry Zitnick, Sing Bing Kang, Marc Levoy.
Proc. CVPR 2006.
Poster
Most algorithms for 3D reconstruction from images use cost functions based on SSD, which assume that the surfaces being
reconstructed are visible to all cameras. This makes it difficult to reconstruct objects which are partially occluded.
Recently, researchers working with large camera arrays have shown it is possible to "see through" occlusions using a
technique called synthetic aperture focusing. This suggests that we can design alternative cost functions that are robust
to occlusions using synthetic apertures. Our paper explores this design space. We compare classical shape from stereo with
shape from synthetic aperture focus. We also describe two variants of multi-view stereo based on color medians and entropy
that increase robustness to occlusions. We present an experimental comparison of these cost functions on complex light fields,
measuring their accuracy against the amount of occlusion.
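
As a rough illustration of why the choice of cost function matters under occlusion, the sketch below scores the samples that several cameras observe for one candidate depth. It is not the paper's formulation: the function names, the use of grayscale values, the median absolute deviation as the "median" cost, and the histogram bin width are all assumptions made for this example.

```python
import math
from collections import Counter
from statistics import mean, median


def variance_cost(samples):
    """SSD-style cost: mean squared deviation from the mean of all samples."""
    m = mean(samples)
    return sum((s - m) ** 2 for s in samples) / len(samples)


def median_cost(samples):
    """Robust cost (assumed form): median absolute deviation from the median."""
    med = median(samples)
    return median([abs(s - med) for s in samples])


def entropy_cost(samples, bin_width=16):
    """Shannon entropy, in bits, of a coarse histogram of the samples."""
    counts = Counter(int(s) // bin_width for s in samples)
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Hypothetical data: nine cameras look at one surface point at the correct
# depth, but three of them see a dark occluder instead of the surface.
occluded = [100, 101, 99, 100, 102, 100, 20, 25, 30]
# Hypothetical data: samples at a wrong depth, where the rays do not agree.
wrong_depth = [40, 90, 150, 200, 10, 120, 60, 180, 230]

for name, cost in [("variance", variance_cost),
                   ("median", median_cost),
                   ("entropy", entropy_cost)]:
    print(name, cost(occluded), cost(wrong_depth))
```

In this toy data the three occluded cameras inflate the variance at the correct depth, while the median-based and entropy-based scores stay low because the majority of cameras still agree.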
Synthetic Aperture Focusing using a Shear-Warp Factorization of the Viewing Transform
Vaibhav Vaish, Gaurav Garg, Eino-Ville Talvala, Emilio Antunez, Bennett Wilburn, Mark Horowitz, Marc Levoy.
Proc. Workshop on Advanced 3D Imaging for Safety and Security
(in conjunction with CVPR 2005)
Oral presentation
Synthetic aperture focusing consists of warping and adding together the images
in a 4D light field so that objects lying on a specified surface are aligned
and thus in focus, while objects lying off this surface are misaligned and
hence blurred. This provides the ability to see through partial occluders such
as foliage and crowds, making it a potentially powerful tool for
surveillance. If the cameras lie on a plane, it has been previously shown that
after an initial homography, one can move the focus through a family of planes
that are parallel to the camera plane by merely shifting and adding the
images. In this paper, we analyze the warps required for tilted focal planes
and arbitrary camera configurations. We characterize the warps using a new
rank-1 constraint that lets us focus on any plane, without having to perform a
metric calibration of the cameras. We also show that there are camera
configurations and families of tilted focal planes for which the warps can be
factorized into an initial homography followed by shifts. This homography
factorization permits these tilted focal planes to be synthesized as
efficiently as frontoparallel planes. Being able to vary the focus by simply
shifting and adding images is relatively simple to implement in hardware and
facilitates a real-time implementation. We demonstrate this using an array of
30 video-resolution cameras; initial homographies and shifts are performed on
per-camera FPGAs, and additions and a final warp are performed on 3 PCs.
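
The shift-and-add step described above can be sketched in a few lines. This is a simplified illustration, not the hardware pipeline from the paper: it assumes 1D scanlines that have already been rectified by the initial homography, cameras on a line, and integer pixel shifts. The names `shift_and_add`, `cam_positions`, and `disparity` are hypothetical.

```python
def shift_and_add(images, cam_positions, disparity):
    """Average 1D images after shifting each by disparity * camera position.

    Samples that would come from outside an image are left out of the average.
    """
    width = len(images[0])
    out = []
    for x in range(width):
        total, count = 0.0, 0
        for img, pos in zip(images, cam_positions):
            src = x + round(disparity * pos)  # integer shift for this camera
            if 0 <= src < width:
                total += img[src]
                count += 1
        out.append(total / count if count else 0.0)
    return out


# Hypothetical scene: one bright point with a disparity of 2 pixels per unit
# of camera offset, seen by three cameras at positions -1, 0 and +1.
cam_positions = [-1, 0, 1]
width = 11
images = []
for pos in cam_positions:
    img = [0.0] * width
    img[5 + 2 * pos] = 1.0  # the point lands at pixel 3, 5 and 7
    images.append(img)

focused = shift_and_add(images, cam_positions, disparity=2)
defocused = shift_and_add(images, cam_positions, disparity=0)
print(focused[5], defocused[5])
```

Choosing the disparity that matches the point aligns its three copies, so it keeps full brightness. Any other disparity spreads the copies over several pixels, which is the blur that lets the array see past objects off the focal plane.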
High Performance Imaging Using Large Camera Arrays
Bennett Wilburn, Neel Joshi, Vaibhav Vaish, Eino-Ville Talvala, Emilio Antunez, Adam Barth,
Andrew Adams, Marc Levoy, Mark Horowitz.
Proc. SIGGRAPH 2005
The advent of inexpensive digital image sensors, and the ability to create
photographs that combine information from a number of sensed images, is
changing the way we think about photography. In this paper, we describe a
unique array of 100 custom video cameras that we have built, and we summarize
our experiences using this array in a range of imaging applications. Our goal
was to explore the capabilities of a system that would be inexpensive to
produce in the future. With this in mind, we used simple cameras, lenses, and
mountings, and we assumed that processing large numbers of images would
eventually be easy and cheap. The applications we have explored include
approximating a conventional single center of projection video camera with high
performance along one or more axes, such as resolution, dynamic range, frame
rate, and/or large aperture, and using multiple cameras to approximate a video
camera with a large synthetic aperture. This permits us to capture a video
light field, to which we can apply spatiotemporal view interpolation algorithms
in order to digitally simulate time dilation and camera motion. It also permits
us to create video sequences using custom non-uniform synthetic apertures.
Synthetic Aperture Confocal Imaging
Marc Levoy, Billy Chen, Vaibhav Vaish, Mark Horowitz, Ian McDowell, Mark Bolas.
Proc. SIGGRAPH 2004
Confocal microscopy is a family of imaging techniques that employ focused
patterned illumination and synchronized imaging to create cross-sectional views
of 3D biological specimens. In this paper, we adapt confocal imaging to
large-scale scenes by replacing the optical apertures used in microscopy with
arrays of real or virtual video projectors and cameras. Our prototype
implementation uses a video projector, a camera, and an array of mirrors.
Using this implementation, we explore confocal imaging of partially occluded
environments, such as foliage, and weakly scattering environments, such as
murky water. We demonstrate the ability to selectively image any plane in a
partially occluded environment, and to see further through murky water than is
otherwise possible. By thresholding the confocal images, we extract mattes
that can be used to selectively illuminate any plane in the scene.
Using Plane + Parallax to Calibrate Dense Camera Arrays
Vaibhav Vaish, Bennett Wilburn, Neel Joshi, Marc Levoy.
Proc. CVPR 2004
Oral Presentation
A light field consists of images of a scene taken from different viewpoints.
Light fields are used in computer graphics for image-based rendering and
synthetic aperture photography, and in vision for recovering shape. In this
paper, we describe a simple procedure to calibrate camera arrays used to
capture light fields using a plane + parallax framework. Specifically, for the
case when the cameras lie on a plane, we show (i) how to estimate camera
positions up to an affine ambiguity, and (ii) how to reproject light field
images onto a family of planes using only knowledge of planar parallax for one
point in the scene. While planar parallax does not completely describe the
geometry of the light field, it is adequate for the first two applications
which, it turns out, do not depend on having a metric calibration of the light
field. Experiments on acquired light fields indicate that our method yields
better results than full metric calibration.
High Speed Video Using a Dense Camera Array
Bennett Wilburn, Neel Joshi, Vaibhav Vaish, Marc Levoy, Mark Horowitz.
Proc. CVPR 2004
Oral Presentation
We demonstrate a system for capturing multi-thousand frame-per-second
(fps) video using a dense array of cheap 30fps CMOS image sensors. A
benefit of using a camera array to capture high speed video is that we
can scale to higher speeds by simply adding more cameras. Even at
extremely high frame rates, our array architecture supports continuous
streaming to disk from all of the cameras. This allows us to record
unpredictable events, in which nothing occurs before the event of
interest that could be used to trigger the beginning of recording.
Synthesizing one high speed video sequence using images from an array
of cameras requires methods to calibrate and correct those cameras'
varying radiometric and geometric properties. We assume that our scene
is either relatively planar or is very far away from the camera and
that the images can therefore be aligned using projective
transforms. We analyze the errors from this assumption and present
methods to make them less visually objectionable. We also present a
method to automatically color match our sensors. Finally, we
demonstrate how to compensate for spatial and temporal distortions
caused by the electronic rolling shutter, a common feature of low-end
CMOS sensors.
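
The scaling argument above, where adding cameras raises the effective frame rate, can be illustrated with a minimal sketch. It assumes that camera i is triggered i/(N x 30) seconds after camera 0 and that the camera index equals the trigger order. It leaves out the geometric alignment, color matching, and rolling-shutter correction the abstract describes. The function name `interleave` is hypothetical.

```python
def interleave(streams, base_fps=30):
    """Merge frames from evenly staggered cameras into one high-speed sequence.

    streams[i][k] is frame k of camera i. Camera i is assumed to fire
    i / (base_fps * n) seconds after camera 0, where n is the camera count.
    Returns the merged frame list and the effective frame rate.
    """
    n = len(streams)
    tagged = []
    for cam, stream in enumerate(streams):
        for k, frame in enumerate(stream):
            tick = k * n + cam  # capture time in units of 1 / (base_fps * n) s
            tagged.append((tick, frame))
    tagged.sort(key=lambda item: item[0])
    return [frame for _, frame in tagged], base_fps * n


# Hypothetical example: three 30 fps cameras, two frames each.
streams = [["a0", "a1"], ["b0", "b1"], ["c0", "c1"]]
video, fps = interleave(streams)
print(video, fps)
```

With three cameras the merged sequence runs at 90 fps. Under the same assumptions, reaching 1000 fps or more would take at least 34 such cameras.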
Robust Fingerprint Authentication Using Local Structural Similarity
Nalini Ratha, Vinayaka Pandit, Ruud Bolle, Vaibhav Vaish.
Proc. Workshop on Applications of Computer Vision, 2000.
Fingerprint matching is challenging as the matcher has to minimize two competing error rates: the False
Accept Rate and the False Reject Rate. We propose a novel, efficient, accurate and distortion-tolerant
fingerprint authentication technique based on graph representation. Using the fingerprint minutiae features,
a labeled and weighted graph of minutiae is constructed for both the query fingerprint and the reference
fingerprint. In the first phase, we obtain a minimum set of matched node pairs by matching their neighborhood
structures. In the second phase, we include more pairs in the match by comparing distances with respect to
matched pairs obtained in the first phase. An optional third phase, extending the neighborhood around each
feature, is entered if we cannot arrive at a decision based on the analysis in the first two phases. The
proposed algorithm has been tested with excellent results on a large private livescan database obtained with
optical scanners.
Synthetic Aperture Focusing Using Dense Camera Arrays
Vaibhav Vaish, Gaurav Garg, Eino-Ville Talvala, Emilio Antunez, Bennett Wilburn, Mark Horowitz, Marc Levoy.
To appear in
3D Imaging for Safety and Security, Springer Verlag.
Editors: A. Koschan, M. Pollefeys, M. Abidi.
Synthetic aperture focusing consists of warping and adding together the images in a 4D light field so that objects lying on a
specified surface are aligned and thus in focus, while objects lying off this surface are misaligned and hence blurred. This
provides the ability to see through partial occluders such as foliage and crowds, making it a potentially powerful tool for
surveillance. In this paper, we describe the image warps required for focusing on any given focal plane, for cameras in general
position without having to perform a complete metric calibration. We show that when the cameras lie on a plane, it is possible to
vary the focus through families of frontoparallel and tilted focal planes by shifting the images after an initial rectification.
Being able to vary the focus by simply shifting and adding images is relatively simple to implement in hardware and facilitates a
real-time implementation. We demonstrate this using an array of 30 video-resolution cameras; initial homographies and shifts are
performed on per-camera FPGAs, and additions and a final warp are performed on 3 PCs.
Extended version of our
shear-warp factorization paper.