Understanding image sharpness part 1:
Introduction to resolution and MTF curves
by Norman Koren
updated February 26, 2007
Image sharpness and detail
[Image: Cumulative effect of the film, lens, scanner and sharpening on the target]
A photograph's detail is an integral part of its appeal. Many
photographers
spend a great deal of time, energy and money acquiring equipment to
make
sharp images. Back in the film era, if 35mm didn't satisfy them, they
invested in medium
format, 4x5,
8x10, or larger.
(I know two who use
8x20 inch
cameras.) The digital versus film debate is now mostly settled (2007),
but there is still some debate over the relationship between the number
of megapixels and image quality. I love sharpness and
detail,
but I take my camera gear on long hikes, so I prefer to carry
lightweight
equipment. I need to know what it can achieve, how to get the most out
of it and what I'm trading off by not going to a larger format, apart
from
saving my back. That's what motivated this study.
The sharpness of a photographic imaging system or of a
component of
the system (lens, film, image sensor, scanner, enlarging lens, etc.) is
characterized
by a parameter called Modulation Transfer Function (MTF),
also known as spatial frequency
response. We present a unique visual explanation of MTF and how it
relates to
image quality. A sample is shown on the right. The top is a target
composed
of bands of increasing spatial frequency, representing 2 to 200 line
pairs
per mm (lp/mm) on the image plane. Below you can see the cumulative
effects
of the lens, film, lens+film, scanner and sharpening algorithm, based
on accurate
computer models derived from published data. If this interests you,
read
on. It gets a little technical, but I try hard to keep it readable.
- This page introduces
MTF and relates it to traditional resolution
measurements.
- Part 1A illustrates
its effect on film and lenses.
- Part 2 continues
with scanners (image sensors) and sharpening algorithms.
- Part
3 discusses printers and prints, and how to characterize
their sharpness
and resolution.
- Part
4 presents detailed printer
test results.
- Part
5 discusses lens testing
using a new downloadable target with continuously varying spatial
frequency.
- Part 6 discusses depth of field
(DOF), emphasizing sharpness at the DOF scale limits.
- Part 7 compares digital
cameras with film, and addresses the question, "How many pixels does it
take
for a digital sensor to outperform 35mm film?"
- Part
8 compares grain and sharpness for three scanners with a
well-crafted
enlarger print, and we look at grain aliasing and software solutions.
The companion website,
Imatest.com,
describes a software tool you can use to measure MTF and other factors
that contribute to image quality in digital cameras and digitized film
images.
Green is for
geeks. Do you get excited by a good
equation? Were you passionate
about your college math classes? Then you're probably a math geek— a member
of a maligned and misunderstood but highly elite fellowship. The text
in
green is for you. If you're normal or mathematically challenged, you
may
skip these sections. You'll never know what you missed.
Introduction
to modulation transfer function (MTF)
Back
in my youth, lens and film resolving power was measured in lines (or
line pairs) per millimeter (lp/mm)— easy to understand, but
poorly standardized.
It was obtained by photographing a chart (typically the
USAF
1951 lens test chart)
and looking for the highest resolution pattern
where detail was visible. Because perception and judgment were
involved, measurements of the same film or lens were highly
inconsistent. Lines per mm would have been more useful if it were
measured at
a well established contrast level, but that was not so easy; it would
have
required expensive instrumentation. The problem of specifying
resolution
and perceived sharpness was solved with the introduction of the
modulation transfer function (MTF), a precise measurement made in the
frequency domain. This made optical engineers
happy, but it confuses many photographers. The goal of this series is to
shed light on the
subject (literally as well as figuratively). I
include
software you can run yourself if you have
Matlab,
a popular program with engineers and scientists.
MTF
is the spatial frequency response of an imaging
system or a component;
it is the contrast at a given spatial frequency relative to low
frequencies.
Spatial
frequency is typically measured in cycles or line pairs per millimeter
(lp/mm), which is analogous to cycles per second
(Hertz) in audio
systems. Lp/mm is most appropriate for film cameras, where formats are
relatively fixed (e.g., 35mm full frame = 24x36mm), but cycles/pixel
(c/p) or line widths per picture height (LW/PH) may be more appropriate
for digital cameras, which have a wide variety of sensor sizes.
High
spatial frequencies correspond to fine image detail. The more extended
the response, the finer the detail—
the sharper the image.
Most of
us are familiar with the frequency of sound, which is
perceived
as pitch and measured in cycles per second, now called Hertz. Audio
components—
amplifiers, loudspeakers, etc.— are characterized by frequency
response
curves.
MTF
is also
a frequency response, except that it involves spatial
frequency—
cycles (line pairs) per distance (millimeters or
inches) instead
of time. The mathematics is the same. The plots on
these pages have
spatial frequencies that increase continuously from left to right. High
spatial frequencies correspond to fine image detail. The response of
photographic
components (film, lenses, scanners, etc.) tends to roll off at high
spatial
frequencies. These components can be thought of as
lowpass filters— filters that pass low frequencies and
attenuate high
frequencies.
Line
pairs or
lines?
All MTF charts and
most
resolution charts display spatial frequency in
cycles
or
line
pairs per unit length
(mm or inch). But there are exceptions. An old standard for measuring
TV
resolution uses line
widths
instead of
pairs,
where there are two line widths per pair, over the total height of the
display. When
dpreview.com
recommends multiplying the chart values in its lens tests by 100 to get
the total vertical lines in the image, they refer to line
widths,
not
pairs.
Confusing,
but I try to keep it straight.
Imatest
SFR displays MTF in cycles (line pairs)
per pixel, line widths per picture height (LW/PH; derived
from TV measurements), and line pairs per distance (mm or in).
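Since these units are easy to mix up, here is a minimal Python sketch of the conversions, assuming only the relationships stated above (two line widths per line pair; picture height and pixel pitch measured in mm). The function names and the 6 micron pixel pitch are illustrative assumptions, not part of Imatest.

```python
def lp_per_mm_to_lw_per_ph(lp_per_mm: float, picture_height_mm: float) -> float:
    """Line widths per picture height: two line widths per line pair."""
    return 2.0 * lp_per_mm * picture_height_mm


def lp_per_mm_to_cycles_per_pixel(lp_per_mm: float, pixel_pitch_mm: float) -> float:
    """Cycles per pixel: cycles per mm multiplied by mm per pixel."""
    return lp_per_mm * pixel_pitch_mm


# 40 lp/mm on a 24 mm high frame (35mm full frame).
print(lp_per_mm_to_lw_per_ph(40.0, 24.0))  # 1920.0

# Hypothetical sensor with a 0.006 mm (6 micron) pixel pitch.
print(lp_per_mm_to_cycles_per_pixel(40.0, 0.006))
```

Note that 0.5 cycles/pixel is the Nyquist limit, so the second value shows how close a given spatial frequency comes to what the sensor can sample.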
The
essential meaning of MTF is rather simple. Suppose you have a pattern
consisting
of a pure tone (a sine wave). At frequencies where the MTF of an
imaging
system or a component (film, lens, etc.) is 100%, the pattern is
unattenuated—
it retains full contrast. At the frequency where MTF is 50%, the
contrast is
half its original value, and so on. MTF is usually
normalized to
100% at very low frequencies. But it can go above 100% with interesting
results.
Contrast levels from 100% to 2% are illustrated on
the right for a variable
frequency sine pattern. Contrast is moderately attenuated for MTF = 50%
and severely attenuated for MTF = 10%. The 2% pattern is visible only
because viewing conditions are favorable: it is surrounded by neutral
gray,
it is noiseless (grainless), and the display contrast for CRTs and most
LCD displays is relatively high. It could easily become invisible under
less favorable conditions.
How is MTF
related to lines per millimeter resolution? The
old resolution measurement— distinguishable lp/mm— corresponds roughly
to spatial frequencies where MTF is between 5% and 2% (0.05 to 0.02).
This
number varies with the observer, most of whom stretch it as far as they
can. An MTF of 9% is implied in the definition of the
Rayleigh
diffraction limit.
Perceived
image sharpness (as distinguished from traditional
lp/mm resolution)
is closely related to the spatial frequency where MTF is 50% (0.5)—
where contrast has dropped by half.
One important detail:
MTF is not
the same as grain. Grain increases with
film speed; MTF is less sensitive to film speed.
MTF corresponds to the
bandwidth
of a communications system; grain corresponds to its
noise.
Grain can be characterized by a frequency spectrum (higher frequencies
correspond to finer grain patterns) as well as amplitude (intensity or
contrast).
Because there is no simple formula that determines how spectrum,
amplitude
and print magnification affect our perception of grain, Kodak has
devised
a subjective measure called "
Print
Grain Index."
Later
in this series
I hypothesize that the Shannon information capacity of an imaging system—
a function of bandwidth and noise— correlates
with perceived
image quality.
The
MTF curve on the right is for Fuji's highly
regarded
Provia 100F
slide film. It's typical except for one detail: MTF isn't 100%
at low spatial frequencies. This is an error— perhaps the
work of an overly
creative marketing department. The 50% MTF frequency (
f50
)
is about 42 lp/mm. MTF is only shown as far as 60 lp/mm. The resolution
of this film is rated as 60 lp/mm for 1.6:1 chart contrast and 140
lp/mm
for 1000:1 chart contrast. The latter number may be of interest to
astronomers,
but it has little to do with the perceived image sharpness of any
realistic
scenes.
The figure below represents a sine pattern (pure frequencies)
with spatial
frequencies from 2 to 200 cycles (line pairs) per mm on a 0.5 mm strip
of film. The top half of the sine pattern has uniform contrast. The
bottom
half illustrates the effects of Provia 100F on the MTF. Pattern
contrast
drops to half at 42 cycles/mm.
A
more precise definition of MTF based
on sine patterns: MTF is the contrast at a given spatial frequency (
f
) relative to contrast at low frequencies. These equations are used in
the page on
Lens
testing to calculate MTF from an image of a chart consisting
of sine
patterns of various frequencies, where the sine pattern contrast in the
original chart is assumed to be constant with frequency. (This series
uses
charts of continuously varying frequency.) Definitions:
- VB: The minimum luminance (or pixel value) for black areas at low spatial frequencies. The frequency should be low enough that contrast doesn't change if it is reduced.
- VW: The maximum luminance for white areas at low spatial frequencies.
- Vmin: The minimum luminance for a pattern near spatial frequency f (a "valley" or "negative peak").
- Vmax: The maximum luminance for a pattern near spatial frequency f (a "peak").
C(0) = (VW - VB)/(VW + VB) is the low frequency (black-white) contrast.
C(f) = (Vmax - Vmin)/(Vmax + Vmin) is the contrast at spatial frequency f. Normalizing contrast in this way, dividing by Vmax + Vmin (VW + VB at low spatial frequencies), minimizes errors due to gamma-related nonlinearities in acquiring the pattern.
MTF(f) = 100% * C(f)/C(0)
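The sine-pattern definition above can be written as a short Python sketch. The function names and the sample luminance values are made up for illustration; only the formulas for C(0), C(f) and MTF(f) come from the text.

```python
def contrast(v_max: float, v_min: float) -> float:
    """Normalized contrast (Vmax - Vmin) / (Vmax + Vmin)."""
    return (v_max - v_min) / (v_max + v_min)


def mtf_percent(v_max: float, v_min: float, v_w: float, v_b: float) -> float:
    """MTF(f) = 100% * C(f) / C(0), using the definitions above."""
    return 100.0 * contrast(v_max, v_min) / contrast(v_w, v_b)


# Hypothetical pixel values: low-frequency white/black of 200/40,
# and a peak/valley of 150/90 near the frequency of interest.
print(round(mtf_percent(150.0, 90.0, 200.0, 40.0), 1))  # 37.5
```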
MTF can also be defined as
the magnitude of
the Fourier transform of the point or line spread function, the response of an
imaging system to an infinitesimal point or line of light. This
definition is technically
accurate and equivalent to the sine pattern contrast definition, but
can't
be visualized as easily unless you're an engineer or physicist.
Imaging
systems
Systems
for reproducing information, images, or sound typically consist
of a chain of components. For example, audio reproduction systems
consist
of a microphone, mike preamp, digitizer or cutting stylus, CD player or
phono cartridge, amplifier, and loudspeaker.
Film imaging systems consist of a lens, film, developer,
scanner, image
editor, and printer (for digital prints) or lens, film, developer,
enlarging
lens, and paper (for traditional darkroom prints). Digital camera-based
imaging systems consist of a lens, digital image sensor, de-mosaicing
program,
image editor, and printer. Each of these components has a
characteristic
frequency response; MTF is merely its name in photography. The beauty
of
working in frequency domain is that
the
response of the entire system (or group of components) can be
calculated
by multiplying the responses of each component.
Typical
50% MTF frequencies are in the vicinity
of 40 to 80 lp/mm for individual components (lenses, film, scanners)
and
often as low as 30 lp/mm for entire imaging systems— much lower than the
80-160 lines/mm numbers typical of the old resolution measurements. It
takes some getting used to if you grew up with the old measurements.
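To make the multiplication rule concrete, here is a minimal Python sketch using NumPy. It assumes a simple second-order rolloff model, MTF = 1/(1 + (f/f50)^2), and three made-up f50 values in the 40 to 80 lp/mm range mentioned above; these are illustrative assumptions, not the component models used elsewhere in this series.

```python
import numpy as np


def second_order_mtf(f, f50):
    """Hypothetical model: MTF = 1 / (1 + (f/f50)^2), equal to 0.5 at f = f50."""
    return 1.0 / (1.0 + (f / f50) ** 2)


def f50_of(f, mtf):
    """Frequency where a monotonically decreasing MTF curve crosses 0.5."""
    # np.interp needs increasing x values, so reverse the falling curve.
    return float(np.interp(0.5, mtf[::-1], f[::-1]))


f = np.linspace(0.0, 200.0, 20001)      # spatial frequency in lp/mm
lens = second_order_mtf(f, 60.0)        # assumed lens f50
film = second_order_mtf(f, 45.0)        # assumed film f50
scanner = second_order_mtf(f, 50.0)     # assumed scanner f50

# System response is the product of the component responses.
system = lens * film * scanner
print("system f50 (lp/mm):", round(f50_of(f, system), 1))
```

With these assumed components the system f50 comes out well below that of the weakest component, which is the effect described above.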
The
response of a component or system to a signal in time or space can be
calculated
by the following procedure.
- Convert
the signal into frequency domain using a mathematical operation known
as
the Fourier transform, which is fast and easy to
perform on modern
computers using the FFT (Fast Fourier Transform) algorithm. The result
of the transform is called the frequency components
or
FFT
of the signal. Images differ from time functions like sound in that
they
are two dimensional. Film has the same MTF in any
direction, but
not lenses.
- Multiply
the frequency components of the signal by the frequency response (or
MTF)
of the component or system.
- Inverse
transform the signal back into time or spatial domain.
Doing
this in time or spatial domain requires a cumbersome mathematical
operation
called convolution. If you try it, you'll know how the word
"convoluted"
originated. And you'll know for sure why frequency domain is widely
appreciated.
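The three steps above, and their equivalence to convolution, can be sketched in Python with NumPy. The sample spacing, the 20 lp/mm bar pattern and the second-order response are assumptions chosen for illustration; the FFT treats the signal as periodic, so the comparison uses circular convolution.

```python
import numpy as np

n = 1024
dx = 0.001                                    # sample spacing in mm (assumed)
x = np.arange(n) * dx
signal = (np.sin(2 * np.pi * 20 * x) > 0).astype(float)   # 20 lp/mm bar pattern

# Step 1: transform the signal into the frequency domain.
spectrum = np.fft.fft(signal)
freqs = np.fft.fftfreq(n, d=dx)               # signed frequencies in cycles/mm

# Step 2: multiply by the frequency response (hypothetical second-order model).
mtf = 1.0 / (1.0 + (freqs / 50.0) ** 2)
filtered_spectrum = spectrum * mtf

# Step 3: inverse transform back into the spatial domain.
filtered = np.fft.ifft(filtered_spectrum).real

# The cumbersome alternative: circular convolution with the impulse response.
impulse_response = np.fft.ifft(mtf).real
reversed_ir = impulse_response[::-1]
convolved = np.array(
    [np.sum(signal * np.roll(reversed_ir, k + 1)) for k in range(n)]
)

print("max difference:", np.max(np.abs(filtered - convolved)))
```

The two results agree to within rounding error, but the convolution loop does about n^2 multiplications where the FFT route needs on the order of n log n.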
Resolution
of an imaging system (old definition)—
Using the assumption that resolution is a frequency where MTF
is
10% or less, the resolution
r of a system consisting of n
components, each of which has an MTF curve similar to those shown
below,
can be approximated by the equation, 1/r = 1/r_1 + 1/r_2 + ... + 1/r_n
(equivalently, r = 1/(1/r_1 + 1/r_2 + ... + 1/r_n)). This equation is adequate as a
first order estimate,
but not as accurate as
multiplying MTFs. [I
verified it with a bit of mathematics, assuming a second order MTF
rolloff
typical of the curves below. It's not sensitive to the MTF percentage
that
defines
r. The approximation, 1/r^2 = 1/r_1^2 + 1/r_2^2 + ..., is not accurate.]
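Here is a small Python check of the reciprocal rule against multiplied MTFs. It assumes a second-order rolloff of the form 1/(1 + (f/fc)^2) for each component and a 10% threshold; the model, the corner frequencies of 40 and 60 lp/mm, and the function names are my own illustrative choices.

```python
import numpy as np


def combined_resolution(*resolutions):
    """First-order estimate: 1/r = 1/r_1 + 1/r_2 + ... + 1/r_n."""
    return 1.0 / sum(1.0 / r for r in resolutions)


def resolution_from_product(corner_freqs, threshold=0.10):
    """Frequency where the product of second-order MTFs first falls to the threshold."""
    f = np.linspace(0.0, 1000.0, 1_000_001)
    mtf = np.ones_like(f)
    for fc in corner_freqs:
        mtf /= 1.0 + (f / fc) ** 2            # assumed second-order rolloff
    return float(f[np.argmax(mtf <= threshold)])


# Each component alone reaches 10% at three times its corner frequency.
r1 = resolution_from_product([40.0])          # about 120 lp/mm
r2 = resolution_from_product([60.0])          # about 180 lp/mm

print("reciprocal rule:", round(combined_resolution(r1, r2), 1))
print("multiplied MTFs:", round(resolution_from_product([40.0, 60.0]), 1))
print("sum of squares :", round((1.0 / r1**2 + 1.0 / r2**2) ** -0.5, 1))
```

Under these assumptions the reciprocal rule lands within a few percent of the multiplied-MTF result, while the sum-of-squares form overestimates the system resolution considerably.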
A
virtual chart for visualizing MTF
To visualize
the effects of MTF, we
have created a virtual target 0.5
mm in length, shown greatly enlarged on the right. The target consists
of a sine pattern and a bar pattern, both of which start at a low
spatial
frequency, 2 line pairs per millimeter (lp/mm) on the left, and
increase
logarithmically to 200 lp/mm on the right.
The mathematics for generating this function
is rather tricky. It is discussed at the end of part
2.
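For readers who want to experiment, here is one way such a target could be generated in Python with NumPy. This is a sketch under my own assumptions (an exponential frequency sweep whose phase is the integral of the instantaneous frequency), not the Matlab code used to produce the figures in this series.

```python
import numpy as np


def log_chirp(length_mm=0.5, f_start=2.0, f_end=200.0, samples=20000):
    """Sine and bar patterns whose frequency rises logarithmically along x."""
    x = np.linspace(0.0, length_mm, samples)
    ratio = f_end / f_start
    # Instantaneous frequency: f(x) = f_start * ratio**(x / length_mm).
    # Phase is 2*pi times the integral of f(x) from 0 to x.
    phase = (2.0 * np.pi * f_start * length_mm / np.log(ratio)
             * (ratio ** (x / length_mm) - 1.0))
    sine = 0.5 + 0.5 * np.sin(phase)             # tonal values between 0 and 1
    bars = (np.sin(phase) >= 0.0).astype(float)  # tonal values of exactly 0 or 1
    return x, sine, bars


x, sine, bars = log_chirp()
print("rising edges in bar pattern:", int(np.sum(np.diff(bars) > 0)))
```

The sample count must be high enough to resolve the 200 lp/mm end of the sweep; 20000 samples over 0.5 mm gives 200 samples per cycle there.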
The
red
curve below the image represents
the tonal densities (0 and 1) of the bar pattern. The vertical
scale, 10^0 through 10^2,
is for the MTF curves to
come, not for the tonal density plot.
[Image: Input image for MTF visualization]
[Image: MTF illustration for Fujichrome Velvia, excellent 35mm lens]
The plot on the left illustrates the response of the virtual
target
to the combined effects of an excellent lens (a simulation of the
highly-regarded
Canon
28-70mm f/2.8L) and film (a simulation of Velvia). Both the
sine and
bar patterns (original and response) are shown. You'll find these plots
throughout this series as we simulate lenses, film, scanners,
sharpening,
and finally, digital cameras.
The red
curve is the spatial response
of the bar pattern to the film + lens. The blue
curve is the combined MTF, i.e., the spatial frequency
response
of the film + lens, expressed in percentage of low frequency response,
indicated on the scale on the left. (It goes over 100% (10^2).)
The thin
blue
dashed curve is the MTF
of the lens only.
The edges in the
bar pattern have been broadened,
and there are small peaks on either side of the edges. The shape of the
edge is inversely related to the MTF response: the more extended the
MTF
response, the sharper (or narrower) the edge. The mid-frequency boost
of
the MTF response is related to the small peaks on either side of the
edges.
The leftmost edge in the plot is a portion
of
the
step response of the system (film + lens). A much
lower spatial
frequency is required to represent it properly. The impulse
response— the
response of the system to a narrow line (or impulse) is
also of interest. The impulse response is the derivative
of the
step response (d(step response)/dx).
The MTF
curve is related to the impulse response
by a mathematical operation known as the Fourier transform (F),
which is well-known to engineers and physicists.
MTF response = F(impulse response)
impulse response = F^-1(MTF response)
F^-1 is the inverse Fourier
transform. We'll spare the gentle reader from further equations; the topic
is quite understandable without them.
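For those who prefer code to equations, here is a minimal Python sketch of the chain from step response to impulse response to MTF, using NumPy. The blurred edge is a hypothetical tanh profile with an assumed 0.005 mm blur scale; it does not model any particular lens or film.

```python
import numpy as np

dx = 0.001                                    # sample spacing in mm (assumed)
x = (np.arange(2048) - 1024) * dx

# Hypothetical step response: a smooth transition from 0 to 1.
width = 0.005                                 # assumed blur scale in mm
step_response = 0.5 * (1.0 + np.tanh(x / width))

# Impulse response = d(step response)/dx, estimated by finite differences.
impulse_response = np.gradient(step_response, dx)

# MTF = magnitude of the Fourier transform, normalized to 1 at zero frequency.
spectrum = np.abs(np.fft.rfft(impulse_response))
mtf = spectrum / spectrum[0]
freqs = np.fft.rfftfreq(x.size, d=dx)         # cycles (line pairs) per mm

# Linear interpolation between the two samples that straddle MTF = 0.5.
i = int(np.argmax(mtf < 0.5))
f50 = freqs[i - 1] + (0.5 - mtf[i - 1]) * (freqs[i] - freqs[i - 1]) / (mtf[i] - mtf[i - 1])
print("f50 (lp/mm):", round(float(f50), 1))
```

Making the assumed edge narrower (a smaller `width`) extends the MTF to higher frequencies, which is the inverse relationship between edge width and MTF described above.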
The image
above represents only 0.5 mm of film, but takes up
around
5 inches (13 cm) on my monitor. At this magnification (260x),
a full frame 35mm image (24x36mm)
would
be 240 inches (6.2 meters) high and 360 inches (9.2 meters) wide. A bit
excessive, but if you stand back from the screen you'll get a feeling
for the effects of the lens, film, scanner (or digital camera), and
sharpening
on real images.
Links
to general articles on MTF
Understanding
MTF: The Modulation Transfer Function Explained by Michael
Reichmann
of
Luminous-landscape.com.
Excellent introduction.
What
is an MTF ...and Why Should You Care? by
Don
Williams of Eastman Kodak.
How to
interpret MTF graphs
by Klaus Schroiff. Another useful explanation.
Photodo
has
several excellent articles on MTF and image quality. Recommended.
MTF
Engineering
Notes from Sine Patterns LLC, a purveyor of lens
test charts.
Lots of equations.
Image
Processing page from efg
(Earl F. Glynn)
Serious links to (mostly) serious academic literature. Fascinating for
geeks.
R.
N.
Clark's scanner detail page is required reading for anyone
interested
in image sharpness. It presents much of the material covered here from
a different viewpoint: real images.
An
Evaluation of the Current State of Digital Photography by
Charles Dickinson.
RIT
bachelor's thesis, 1999. Uses MTF analysis.
Introduction
to Electronic Imaging Systems Class notes from ECE
102, Center
for Electronic Imaging Systems, University of Rochester. Taught by
Dr.
Michael Kriss. Connected with the
U
of R Image Processing Lab.
RIT
Center
for Imaging Science class material is a serious
resource— well worth
exploring.
Basic
Principles
of Imaging Science 1. Lectures
17
and
18
on MTF and imaging microstructure are particularly interesting.
Human
visual acuity
The
ability of the eye to resolve detail is known as "visual acuity." The
normal human eye can distinguish patterns of alternating black and
white
lines with a feature size as small as one minute of an arc (1/60 degree
or π/(60*180) = 0.000291 radians).
That,
incidentally, is the definition
of 20-20 vision. A few exceptional eyes may be able to distinguish
features
half this size. But for most of us, a pattern of higher spatial
frequency
will appear nearly pure gray. Low contrast patterns at the maximum
spatial
frequency will also appear gray.
At
a distance d from the eye (which has a nominal
focal length of 16.5
mm), this corresponds to objects of length = (angle in radians)*d
= 0.000291*d. For
example, for an object viewed at a distance of
25 cm (about 10 inches), the distance you might use for close scrutiny
of an 8x10 inch
photographic print,
this would correspond to 0.0727 mm = 0.0029 inches. Since a line pair
corresponds
to two lines of this size, the corresponding spatial frequency is 6.88
lp/mm or 175 lp/inch. Assume now that the image was printed from a 35mm
frame enlarged 8x.
The corresponding
spatial frequency on the film would be 55 lp/mm.
This means that for an 8x10
inch print, the MTF of a 35mm camera
(lens + film, etc.) above 55 lp/mm, or the MTF of a digital
camera above 2800 LW/PH (Line Widths per Picture
Height) measured
by Imatest SFR,
has no effect on the appearance
of the print.
That's why
the highest spatial frequency used in manufacturers' MTF charts is
typically 40 lp/mm, which
provides an excellent indication of a lens's perceived sharpness in an
8x10 inch print
enlarged 8x.
Of course higher spatial frequencies are of interest for larger prints.
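The arithmetic above is easy to reproduce for other viewing distances and enlargements. A minimal Python sketch follows; the function name is illustrative, and taking the picture height as the 8 inch print dimension is my reading of how the LW/PH figure arises.

```python
import math


def acuity_limit_lp_per_mm(viewing_distance_mm: float, feature_arcmin: float = 1.0) -> float:
    """Highest resolvable spatial frequency on a print at a given viewing distance."""
    feature_rad = math.radians(feature_arcmin / 60.0)   # 1 arc minute = 0.000291 rad
    feature_mm = feature_rad * viewing_distance_mm      # smallest distinguishable feature
    return 1.0 / (2.0 * feature_mm)                     # two features per line pair


print_limit = acuity_limit_lp_per_mm(250.0)    # print viewed at 25 cm
film_limit = print_limit * 8.0                 # 35mm frame enlarged 8x
lw_ph = 2.0 * print_limit * 8.0 * 25.4         # assumed picture height: 8 inches

print(round(print_limit, 2), round(film_limit, 1))  # 6.88 55.0
print(round(lw_ph))
```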
Standard Depth of
Field (DOF) scales on lenses
are based on the assumption, made in the 1930s, that the smallest
feature
of importance, viewed at 25 cm, is 0.01 inches— 3 times
larger. It shouldn't
be a surprise that focus isn't terribly sharp at the DOF limits. See
the
DOF
page for more details.
The statement that the eye cannot distinguish
features smaller than
one minute of an arc is, of course, oversimplified. The eye has an MTF
response, just like any other optical component. It is illustrated on
the
right from the Handout #9: Human
Visual Perception from Stanford
University course EE368B - Image and Video Compression by
Professor
Bernd Girod. The horizontal axis is angular frequency in cycles per
degree (CPD). MTF is
shown for pupil sizes from 2 mm (bright lighting; f/8), to 5.8 mm (dim
lighting; f/2.8). At 30 CPD, corresponding to a one minute of an arc
feature
size, MTF drops from 0.4 for the 2 mm pupil to 0.16 for the 5.8 mm
pupil.
(Now you know your eye's f-stop range. It's similar to compact digital
cameras.) Another
Stanford page has Matlab computer models of the eye's MTF.
[Image: The eye's contrast sensitivity function]
The
human eye's MTF, which is limited at
high angular frequencies by the
eye's optical system and cone density, does not tell the whole story of
the eye's response. Neuronal interactions such
as lateral inhibition
limit the eye's response at
low
angular frequencies, i.e., the eye is insensitive to very gradual
changes in density. The eye's overall response is called its
contrast
sensitivity function (CSF). Various
studies place
the peak CSF for bright light levels (typical of print viewing
conditions) between 6 and 8 cycles per degree. The graph on the left
uses an approximation (equations below) that peaks just below 8
cycles/degree.
CSF is used in measures of
perceptual image sharpness called
Acutance and
Subjective
Quality Factor (SQF) , which
includes MTF, CSF, print size, and typical viewing distance.
SQF has been used since the 1970s inside Kodak and Polaroid, but it was
difficult to calculate, and hence remained obscure, until it was
incorporated into
Imatest in 2006.
The following formula for CSF is
relatively simple, recent, and
fits the data well. The source is J. L. Mannos, D. J.
Sakrison, ``The Effects of a Visual Fidelity Criterion on the Encoding
of Images'', IEEE Transactions on Information Theory, pp. 525-535, Vol.
20, No 4, (1974), cited on
this page of
Kresimir Matkovic's 1998 PhD
thesis.
CSF(f) = 2.6 (0.0192 + 0.114 f) exp(-(0.114 f)^1.1)
The 2.6 multiplier can be removed and the
equation can be simplified
somewhat. The dc term (0.0192) can be dropped with very little effect.
CSF(f) = (0.0192 + 0.114 f) exp(-0.1254 f)
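Both forms are simple to evaluate numerically. The Python sketch below uses NumPy and takes the Mannos-Sakrison exponent as exp(-(0.114 f)^1.1), which is my reading of the published formula; the function names and the 0 to 60 cycles/degree grid are illustrative.

```python
import numpy as np


def csf_mannos_sakrison(f):
    """CSF(f) = 2.6 (0.0192 + 0.114 f) exp(-(0.114 f)^1.1), f in cycles/degree."""
    return 2.6 * (0.0192 + 0.114 * f) * np.exp(-((0.114 * f) ** 1.1))


def csf_simplified(f):
    """Simplified form: (0.0192 + 0.114 f) exp(-0.1254 f)."""
    return (0.0192 + 0.114 * f) * np.exp(-0.1254 * f)


f = np.linspace(0.0, 60.0, 6001)               # cycles per degree
peak_full = float(f[np.argmax(csf_mannos_sakrison(f))])
peak_simple = float(f[np.argmax(csf_simplified(f))])
print("peak frequencies (cycles/degree):", peak_full, peak_simple)
```

Both versions peak just below 8 cycles/degree, so for locating the peak the simplification costs very little.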
Additional
explanations of human visual
acuity can be found on pages
from the Nondestructive
testing resource center and Stanford
University. Page 3 from Stanford has a plot of the MTF of the
human
eye. I believe the x-axis units (CPD) are Cycles per Degree, where a
pair
of 1/60 degree features corresponds to 30 CPD.
Images
and text copyright © 2000-2013 by Norman Koren. Norman Koren lives
in Boulder, Colorado, where he worked in developing magnetic recording
technology for high capacity data storage systems until 2001. Since 2003 most of his time has been devoted to the development of
Imatest. He has been involved with photography since 1964.