
Tuesday, May 29, 2007

Resolving the panorama

This post is about the image stitching methods used to make a panoramic image. Panoramic images were originally developed to increase the field of view of a photograph, and they have become even more important in the digital age: because a print looks poor below about 200 dots per inch (explained here), a single frame from a typical camera does not contain enough pixels for a poster-sized print. The usual workaround is to take a number of photographs with at least 15% overlap and stitch them together later in software. To take the individual pictures that make up a panorama, the best technique is to use a tripod so that the camera rotates about a fixed point (ideally the lens's no-parallax point), which eliminates parallax error. In addition, the aperture and shutter speed should not vary between the pictures. More tips on shooting panoramas can be found all over Google, or by sending me an email. This post is more about the science behind stitching the images together.

The idea of image stitching is to take multiple images and make a single image from them with an invisible seam, such that the mosaic remains true to the individual images (in other words, it does not change the lighting effects too much). This is different from simply placing the images side by side, because differences in lighting between the two images would leave a prominent seam in the mosaic.



This figure shows three photos, with the locations of the seams marked by black boxes on each picture, and the final mosaic formed from all three.

The first step is to find points that correspond to each other in two overlapping pictures [1]. This can be done by comparing a small neighborhood of pixels around a point in one picture with neighborhoods in the other picture and finding the regions whose colors match best. The images are then placed or warped onto a surface such as a cylinder (since a panorama is often a 2-dimensional unrolling of the overlapping pictures onto a cylinder). After this step, a seam curve is found through the overlap region along which the corresponding pixels of the two images agree best. Finally, the images are stitched together with color correction. The rest of this post deals with the various algorithms for this color-correction (blending) step.
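To make the first step concrete, here is a minimal NumPy sketch of matching a pixel neighborhood from one picture against another by normalized cross-correlation. The function names and the brute-force search are my own illustration under assumed inputs (grayscale arrays), not the method from the lecture notes; real stitchers use feature detectors instead of scanning every location.

    import numpy as np

    def normalized_cross_correlation(patch_a, patch_b):
        # Similarity of two equally sized grayscale patches, in [-1, 1].
        a = patch_a - patch_a.mean()
        b = patch_b - patch_b.mean()
        denom = np.sqrt((a * a).sum() * (b * b).sum())
        return (a * b).sum() / denom if denom > 0 else 0.0

    def best_match(image_b, patch_a, stride=4):
        # Slide patch_a over image_b and return the position where it matches best.
        ph, pw = patch_a.shape
        best_score, best_pos = -1.0, (0, 0)
        for y in range(0, image_b.shape[0] - ph, stride):
            for x in range(0, image_b.shape[1] - pw, stride):
                window = image_b[y:y + ph, x:x + pw]
                score = normalized_cross_correlation(window, patch_a)
                if score > best_score:
                    best_score, best_pos = score, (y, x)
        return best_pos, best_score

For example, a patch cut from the right edge of the first picture can be searched for near the left edge of the second picture; the best-matching position tells us how the two pictures should be aligned before warping.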



This figure is an example of the Feathering approach.


1. Feathering (Alpha Blending): In this method, within the seam region (the region of overlap), each pixel of the blended image is given a color that is a weighted average (a linear combination) of the pixel colors of the 1st and 2nd images, with the weight shifting gradually from one image to the other. The effect is to blur the differences between the two images across the seam. An optimal window size is chosen for the blend so that the blurring is least visible.




This figure shows the optimal blend between the two images in the previous figure.
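To illustrate the feathering idea described above, here is a minimal sketch of blending two overlapping grayscale strips with NumPy. The linear weight ramp and the fixed overlap width are assumptions for the example, not the optimal-window procedure from the slides.

    import numpy as np

    def feather_blend(left, right, overlap):
        # Blend two grayscale images that overlap by `overlap` columns.
        # Inside the overlap, the weight of the left image ramps linearly from 1 to 0.
        h = left.shape[0]
        out_w = left.shape[1] + right.shape[1] - overlap
        out = np.zeros((h, out_w), dtype=np.float64)

        out[:, :left.shape[1] - overlap] = left[:, :-overlap]
        out[:, left.shape[1]:] = right[:, overlap:]

        alpha = np.linspace(1.0, 0.0, overlap)   # weight of the left image
        out[:, left.shape[1] - overlap:left.shape[1]] = (
            alpha * left[:, -overlap:] + (1.0 - alpha) * right[:, :overlap]
        )
        return out

The wider the overlap window, the more gradually the two exposures are mixed; too narrow a window leaves a visible seam, too wide a window blurs detail, which is why an optimal window size is sought.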


2. Pyramid Blending: Besides the plain pixel representation, an image can also be stored as a pyramid: a hierarchy of progressively low-pass filtered (and downsampled) versions of the original image, so that successive levels correspond to lower spatial frequencies (this representation was originally proposed as a compact image code). In effect, the image is split into layers that vary over smaller or larger regions of space, and summing the layers gives back the original image. During blending, the low frequencies (which vary over larger distances) are blended over a spatially larger distance and the high frequencies are blended over a spatially smaller distance [1], producing a more realistic blended image. In practice the blending is done on a Laplacian pyramid, whose levels are differences between successive low-pass levels (an approximation to the 2nd derivative, hence the name); the two Laplacian pyramids are blended level by level, and the blended pyramid is then collapsed to form the final blended image.



This figure shows the pyramid representation of the pixels in an image and the pyramid blending approach.
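Here is a compact sketch of Laplacian-pyramid blending with NumPy and SciPy. The number of levels, the use of a soft mask, and the helper names are assumptions for illustration; the exact procedure in the lecture notes may differ.

    import numpy as np
    from scipy.ndimage import gaussian_filter, zoom

    def gaussian_pyramid(img, levels):
        # Repeatedly blur and halve the image.
        pyr = [img.astype(np.float64)]
        for _ in range(levels - 1):
            pyr.append(gaussian_filter(pyr[-1], sigma=1.0)[::2, ::2])
        return pyr

    def laplacian_pyramid(img, levels):
        # Each level = Gaussian level minus the upsampled next (coarser) level.
        gp = gaussian_pyramid(img, levels)
        lp = []
        for i in range(levels - 1):
            up = zoom(gp[i + 1], np.array(gp[i].shape) / np.array(gp[i + 1].shape), order=1)
            lp.append(gp[i] - up)
        lp.append(gp[-1])   # keep the coarsest level as-is
        return lp

    def pyramid_blend(img_a, img_b, mask, levels=4):
        # Blend two grayscale images; mask is 1 where img_a should dominate.
        la = laplacian_pyramid(img_a, levels)
        lb = laplacian_pyramid(img_b, levels)
        gm = gaussian_pyramid(mask.astype(np.float64), levels)   # mask gets softer at coarser levels
        blended = [m * a + (1 - m) * b for a, b, m in zip(la, lb, gm)]
        out = blended[-1]
        for level in reversed(blended[:-1]):   # collapse: upsample and add, coarse to fine
            out = level + zoom(out, np.array(level.shape) / np.array(out.shape), order=1)
        return out

Because the mask itself is blurred more at the coarser levels, the low-frequency content is mixed over a wide region while fine detail is switched over sharply, which is exactly the behavior described above.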

3. Gradient Domain Blending: Instead of building a low-resolution hierarchy of the image as above, gradient domain blending works with the 1st derivatives (gradients) of the images. The image resolution is therefore not reduced before blending, but the idea is similar: the gradients of the two images are combined in the overlap region, and the final image is recovered by integrating the combined gradient field (solving a Poisson equation). Because slowly varying lighting differences get spread out during this integration while sharp detail is preserved, the method adapts automatically to regions that vary quickly or slowly, without having to hand-pick an optimal blending window.


This figure shows the gradient blend approach.
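To make the idea concrete, here is a minimal 1-D sketch of my own (not the method from the slides): the derivatives of two overlapping signals are blended and then re-integrated, so a constant exposure difference between them never shows up as a step.

    import numpy as np

    def gradient_domain_blend_1d(left, right, overlap):
        # Blend two 1-D signals that overlap by `overlap` samples (overlap >= 2).
        # Only the discrete derivatives are combined; the result is rebuilt by
        # integration, so a constant offset between the signals cannot create a seam.
        d_left = np.diff(left)
        d_right = np.diff(right)

        alpha = np.linspace(1.0, 0.0, overlap - 1)
        d_overlap = alpha * d_left[-(overlap - 1):] + (1 - alpha) * d_right[:overlap - 1]
        d_blend = np.concatenate([d_left[:-(overlap - 1)], d_overlap, d_right[overlap - 1:]])

        # Integrate, anchoring the result at the first sample of the left signal.
        return np.concatenate([[left[0]], left[0] + np.cumsum(d_blend)])

    # Two "exposures" of the same ramp, the right one brighter by a constant 20.
    x = np.linspace(0.0, 100.0, 100)
    left, right = x[:60], x[40:] + 20
    blended = gradient_domain_blend_1d(left, right, overlap=20)

In two dimensions the same principle applies, but recovering the image from the blended gradient field requires solving a Poisson equation rather than a simple cumulative sum.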

Sources:
[1]: http://www.cs.huji.ac.il/course/2005/impr/lectures2005/Tirgul10_BW.pdf
[2]: http://rfv.insa-lyon.fr/~jolion/IP2000/report/node78.html

Wikipedia article on feathering.

All figures taken from http://www.cs.huji.ac.il/course/2005/impr/lectures2005/Tirgul10_BW.pdf

Monday, October 09, 2006

How does a digital camera work?

In the previous post, I talked about how a digital photograph is stored in the computer. In this post, I will talk about how a digital camera senses the photograph, covering only the essential components (and describing things in terms of RGB colors). A modern digital camera is far more advanced than the simplistic picture explained here.

A digital camera has a number of lenses which focus light onto a chip that is sensitive to incoming light. Two types of image sensors are on the market: the charge-coupled device (CCD) and the complementary metal-oxide-semiconductor (CMOS) sensor. CCDs [1] are far more popular than CMOS chips because they are considered less susceptible to noise (and I will use CCDs to explain how a digital camera works). The role of this chip is to sense the incoming light and convert the light energy into an electrical signal, which is then amplified, digitized, and finally processed.

How does a CCD work? The photoelectric effect [2] is the property by which some materials emit electrons when light shines on them. The CCD in a digital camera is a silicon chip covered with a grid of small electrodes called photosites; one photosite corresponds to each pixel.

Before a photo is taken, the camera charges the surface of each photosite. When light strikes a photosite, the silicon at that site releases electrons, which collect at the other end of the site (the site effectively acts as a small capacitor). The greater the intensity of the light that falls on a site, the larger the number of electrons released, and hence the larger the voltage that develops across the photosite. The voltage is then converted by an analog-to-digital converter into a number that corresponds to the intensity of the light falling on that site. This takes care of intensity, but we have not yet discussed how the photosite knows the color of the light.
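As a toy illustration of that last step, here is how the analog-to-digital conversion might be modelled; the full-scale voltage and the 8-bit depth are assumptions for the example, not the specification of any real sensor.

    import numpy as np

    def adc_8bit(voltages, full_scale=1.0):
        # Quantize photosite voltages (0 .. full_scale) to 8-bit values 0..255.
        levels = np.clip(voltages / full_scale, 0.0, 1.0) * 255
        return np.round(levels).astype(np.uint8)

    # Four photosites under increasing illumination -> increasing digital counts.
    print(adc_8bit(np.array([0.0, 0.25, 0.5, 1.2])))   # [  0  64 128 255]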

As discussed earlier, the color of a pixel is formed by mixing red, green, and blue (RGB). Not all of the light reaches each photosite: a filter placed on top of each photosite lets only red, only green, or only blue light through. Hence, depending on its filter, each photosite measures the intensity of just one of the three colors falling on it and no other. To estimate, say, the green and blue intensities at a site with a red filter, an interpolation algorithm (a process called demosaicing) approximates them from the intensities measured at neighboring sites.

Lastly, since green lies near the center of the visible spectrum (VIBGYOR), our eyes are better at picking up different shades of green, and hence there are more photosites sensing green light than blue or red (twice as many, in fact). The Bayer pattern shown below is the most common arrangement of filters in a single-array CCD chip.
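As a rough sketch of what the demosaicing step can look like for an RGGB Bayer layout, here is a simple bilinear interpolation in NumPy/SciPy; real cameras use more sophisticated algorithms, and the RGGB tiling is an assumption for the example.

    import numpy as np
    from scipy.signal import convolve2d

    def demosaic_bilinear(raw):
        # `raw` holds one measured value per photosite (RGGB Bayer layout);
        # the two missing color channels at each pixel are interpolated
        # from the neighboring sites that did measure them.
        h, w = raw.shape
        rows, cols = np.indices((h, w))
        r_mask = (rows % 2 == 0) & (cols % 2 == 0)
        b_mask = (rows % 2 == 1) & (cols % 2 == 1)
        g_mask = ~(r_mask | b_mask)

        # Bilinear interpolation kernels for the sparsely sampled channels.
        k_g = np.array([[0, 1, 0], [1, 4, 1], [0, 1, 0]]) / 4.0
        k_rb = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]]) / 4.0

        def interpolate(mask, kernel):
            sparse = np.where(mask, raw.astype(np.float64), 0.0)
            return convolve2d(sparse, kernel, mode="same", boundary="symm")

        return np.dstack([interpolate(r_mask, k_rb),
                          interpolate(g_mask, k_g),
                          interpolate(b_mask, k_rb)])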



At the other end of the market, expensive digital cameras (read: tens of thousands of dollars) have multiple sensor arrays and avoid the interpolation step altogether: the incoming light is split into three copies, passed through three separate filters onto three different arrays, and sensed separately; the final picture is made by merging these readings together.

There are more complications even in this simple picture of a camera, but perhaps another post will deal with them (no promises, as I have some research to do before I can post again).

[1] CCDs were invented by George Smith and Willard Boyle at Bell Labs.
[2] Albert Einstein won the Nobel prize in 1921 for the quantum explanation of the photoelectric effect.

Source:
The source for most of this stuff is Chapter 2 of Complete Digital Photography (Third Edition) by Ben Long, though any mistakes here are probably mine.

For further reading:

How Stuff Works answers how a camera works
How an image sensor works?
CCD vs CMOS
Wikipedia's CCD entry

Friday, September 29, 2006

How is a digital photo stored?

Just as the word digital suggests, a digital camera digitizes every image it captures. What this means is that a typical rectangular photo is divided into a large number of small squares, each of which is a basic picture element called a pixel. Within a given pixel, the color of the image does not change. In the simplest case, where a pixel can only be black or white, its state is binary: either 0 (black) or 1 (white). The state of each pixel is stored at each location, and when all the pixels are put back together, you get your image back. The image is then said to be digitized.

In the real world, most pictures are stored either in color or in various levels of grey scale. For example, when each pixel is stored as 8 bits, the number of possible states is 256: a number from 0 to 255. 0 represents black, 1 to 254 represent various shades of grey of increasing brightness, and 255 represents white. So when a color photo taken with a digital camera is converted to "black and white", the correct term is actually a grey scale photo, because it has varying levels between black and white.
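To make this concrete, a grey scale image is then just a 2-D array of such numbers, one per pixel; this tiny example is made up for illustration.

    import numpy as np

    # A tiny 3x4 "photo": each entry is one pixel's 8-bit grey level,
    # 0 = black, 255 = white, values in between = shades of grey.
    image = np.array([[  0,  64, 128, 255],
                      [ 32,  96, 160, 224],
                      [255, 192, 128,   0]], dtype=np.uint8)

    print(image.shape)    # (3, 4): 3 rows of 4 pixels
    print(image.nbytes)   # 12 bytes: one byte (8 bits) per pixel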

In color photographs, each pixel is made by combining various levels of primary colors, much as an artist mixes two paints to make a third color. The primary colors are the colors that can be combined to make all the colors a digital camera needs. In a typical camera, the primaries used are red, green, and blue (the RGB colors). Another commonly used set is cyan, magenta, yellow, and black (the CMYK colors), used mainly in printing. Each color set has its own advantages, as it can capture a certain range of colors well. The rest of the post is based on an RGB photo, though it applies equally to CMYK or any other color set.

In the case of an RGB photo, each pixel uses three storage units, one per color, of a certain bit size. With 8-bit storage units, each pixel has one unit for the intensity of red (a number from 0 to 255 signifying the level of red, just like the grey scale values discussed above), one for green, and one for blue. The three levels are mixed to form the actual color of that pixel. A picture stored in this manner is a 24-bit picture, because each pixel occupies a 24-bit memory unit. The fact that the color within a pixel does not vary can be seen when you blow a picture up beyond 100%. When the picture on the left is blown up far enough to see individual pixels, it looks like the one on the right:



(Picture courtesy http://photo.net/equipment/digital/basics/)
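A minimal sketch of this 24-bit layout (the pixel values are made up for illustration):

    import numpy as np

    # A 2x2 RGB image: three 8-bit units (red, green, blue) per pixel.
    photo = np.array([[[255,   0,   0], [  0, 255,   0]],    # a red pixel, a green pixel
                      [[  0,   0, 255], [255, 255, 255]]],   # a blue pixel, a white pixel
                     dtype=np.uint8)

    print(photo.shape)    # (2, 2, 3): 2x2 pixels, 3 color channels each
    print(photo.nbytes)   # 12 bytes = 4 pixels x 3 bytes (24 bits) per pixel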


The more pixels a picture has, the more pixels per inch (PPI) it can provide at a given print size. To print a photo on paper, the resolution should be greater than about 200 PPI along each side of the rectangle. A 3 MP camera can therefore comfortably produce 4" x 6" prints. The greater the resolution of the camera, the larger the print that can be made from it.
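The arithmetic behind that claim, as a quick sanity check (the 2048 x 1536 sensor size is an assumed, typical 3 MP layout):

    # Pixels needed for a 4" x 6" print at 200 pixels per inch:
    needed = (4 * 200) * (6 * 200)      # 800 x 1200 = 960,000 pixels
    # A typical 3 MP sensor provides about:
    available = 2048 * 1536             # ~3.1 million pixels
    print(needed, available)            # the 3 MP camera has pixels to spare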

I will follow this post with a post on how the digital camera senses each photograph.

To get more information on this:
Photo.net's tutorial
Cambridge's tutorial