Tools
Certain tools are central to the processing of digital images. These include mathematical tools such as convolution, Fourier analysis, and statistical descriptions, and manipulative tools such as chain codes and run codes. We will present these tools without any specific motivation. The motivation will follow in later sections.
Convolution
There are several possible notations to indicate the convolution of two (multi-dimensional) signals to produce an output signal. The most common are:

c = a \otimes b = a * b

We shall use the first form, c = a \otimes b, with the following formal definitions.

In 2D continuous space:

c(x,y) = a(x,y) \otimes b(x,y) = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} a(\chi,\zeta)\, b(x-\chi,\, y-\zeta)\, d\chi\, d\zeta

In 2D discrete space:

c[m,n] = a[m,n] \otimes b[m,n] = \sum_{j=-\infty}^{+\infty} \sum_{k=-\infty}^{+\infty} a[j,k]\, b[m-j,\, n-k]
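The 2D discrete convolution sum can be implemented directly. The following is a minimal sketch in Python with NumPy (the function name and the test image are illustrative only, and efficiency is ignored):

```python
import numpy as np

def conv2d(a, b):
    """Direct 2D discrete convolution: c[m,n] = sum_j sum_k a[j,k] * b[m-j, n-k].

    'a' is the image and 'b' a finite-support kernel; the output has the full
    size (Ma + Mb - 1) x (Na + Nb - 1), i.e. no border values are discarded.
    """
    Ma, Na = a.shape
    Mb, Nb = b.shape
    c = np.zeros((Ma + Mb - 1, Na + Nb - 1))
    for j in range(Ma):
        for k in range(Na):
            # Each image sample a[j,k] contributes a shifted, weighted copy of b.
            c[j:j + Mb, k:k + Nb] += a[j, k] * b
    return c

# A 3x3 averaging kernel applied to a small test image.
a = np.arange(16.0).reshape(4, 4)
b = np.ones((3, 3)) / 9.0
c = conv2d(a, b)
```

Note that the sum of the output equals the product of the sums of the inputs, a useful sanity check for any convolution implementation.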
Properties of Convolution
There are a number of important mathematical properties associated with convolution.

* Convolution is commutative:

c = a \otimes b = b \otimes a

* Convolution is associative:

c = a \otimes (b \otimes d) = (a \otimes b) \otimes d = a \otimes b \otimes d

* Convolution is distributive:

c = a \otimes (b + d) = (a \otimes b) + (a \otimes d)

where a, b, c, and d are all images, either continuous or discrete.
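For discrete images these three properties are easy to verify numerically. A sketch in Python, assuming NumPy and SciPy are available:

```python
import numpy as np
from scipy.signal import convolve2d

rng = np.random.default_rng(0)
a = rng.random((5, 5))
b = rng.random((3, 3))
d = rng.random((3, 3))

def conv(x, y):
    # Full linear convolution of two 2D arrays.
    return convolve2d(x, y)

# Commutative: a (x) b = b (x) a
assert np.allclose(conv(a, b), conv(b, a))
# Associative: a (x) (b (x) d) = (a (x) b) (x) d
assert np.allclose(conv(a, conv(b, d)), conv(conv(a, b), d))
# Distributive: a (x) (b + d) = (a (x) b) + (a (x) d)
assert np.allclose(conv(a, b + d), conv(a, b) + conv(a, d))
```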
Fourier Transforms
The Fourier transform produces another representation of a signal, specifically a representation as a weighted sum of complex exponentials. Because of Euler's formula:

e^{jq} = \cos(q) + j \sin(q)

where j = \sqrt{-1}, we can say that the Fourier transform produces a representation of a (2D) signal as a weighted sum of sines and cosines. The defining formulas for the forward Fourier and the inverse Fourier transforms are as follows. Given an image a and its Fourier transform A, the forward transform goes from the spatial domain (either continuous or discrete) to the frequency domain, which is always continuous.

Forward:  A = \mathcal{F}\{a\}

The inverse Fourier transform goes from the frequency domain back to the spatial domain.

Inverse:  a = \mathcal{F}^{-1}\{A\}

The Fourier transform is a unique and invertible operation so that:

a = \mathcal{F}^{-1}\{\mathcal{F}\{a\}\}   and   A = \mathcal{F}\{\mathcal{F}^{-1}\{A\}\}
The specific formulas for transforming back and forth between the spatial domain and the frequency domain are given below.

In 2D continuous space:

Forward:  A(u,v) = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} a(x,y)\, e^{-j(ux+vy)}\, dx\, dy

Inverse:  a(x,y) = \frac{1}{4\pi^2} \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} A(u,v)\, e^{+j(ux+vy)}\, du\, dv

In 2D discrete space:

Forward:  A(\Omega,\Psi) = \sum_{m=-\infty}^{+\infty} \sum_{n=-\infty}^{+\infty} a[m,n]\, e^{-j(\Omega m + \Psi n)}

Inverse:  a[m,n] = \frac{1}{4\pi^2} \int_{-\pi}^{+\pi} \int_{-\pi}^{+\pi} A(\Omega,\Psi)\, e^{+j(\Omega m + \Psi n)}\, d\Omega\, d\Psi

Properties of Fourier Transforms
There are a variety of properties associated with the Fourier transform and the inverse Fourier transform. The following are some of the most relevant for digital image processing.

* The Fourier transform is, in general, a complex function of the real frequency variables. As such, the transform can be written in terms of its magnitude and phase:

A(u,v) = |A(u,v)|\, e^{j\varphi(u,v)}
* A 2D signal can also be complex and thus written in terms of its magnitude and phase:

a(x,y) = |a(x,y)|\, e^{j\vartheta(x,y)}

* If a 2D signal is real, then the Fourier transform has certain symmetries:

A(u,v) = A^*(-u,-v)

The symbol (*) indicates complex conjugation. For real signals this leads directly to:

|A(u,v)| = |A(-u,-v)|   and   \varphi(u,v) = -\varphi(-u,-v)
* If a 2D signal is real and even, then the Fourier transform is real and even:

A(u,v) = A(-u,-v)

* The Fourier and the inverse Fourier transforms are linear operations:

\mathcal{F}\{w_1 a + w_2 b\} = w_1 A + w_2 B   and   \mathcal{F}^{-1}\{w_1 A + w_2 B\} = w_1 a + w_2 b

where a and b are 2D signals (images) and w_1 and w_2 are arbitrary, complex constants.

* The Fourier transform in discrete space, A(\Omega,\Psi), is periodic in both \Omega and \Psi. Both periods are 2\pi:

A(\Omega + 2\pi j,\, \Psi + 2\pi k) = A(\Omega,\Psi)   for j, k integers
* The energy, E, in a signal can be measured either in the spatial domain or the frequency domain. For a signal with finite energy:

Parseval's theorem (2D continuous space):

E = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} |a(x,y)|^2\, dx\, dy = \frac{1}{4\pi^2} \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} |A(u,v)|^2\, du\, dv

Parseval's theorem (2D discrete space):

E = \sum_{m=-\infty}^{+\infty} \sum_{n=-\infty}^{+\infty} |a[m,n]|^2 = \frac{1}{4\pi^2} \int_{-\pi}^{+\pi} \int_{-\pi}^{+\pi} |A(\Omega,\Psi)|^2\, d\Omega\, d\Psi

This "signal energy" is not to be confused with the physical energy in the phenomenon that produced the signal. If, for example, the value a[m,n] represents a photon count, then the physical energy is proportional to the amplitude, a, and not the square of the amplitude. This is generally the case in video imaging.

* Given three multi-dimensional signals a, b, and c and their Fourier transforms A, B, and C:

c = a \otimes b  \Longleftrightarrow  C = A \cdot B

c = a \cdot b  \Longleftrightarrow  C = \frac{1}{4\pi^2}\, A \otimes B

In words: convolution in the spatial domain is equivalent to multiplication in the Fourier (frequency) domain and vice versa. This is a central result which provides not only a methodology for the implementation of a convolution but also insight into how two signals interact with each other, under convolution, to produce a third signal. We shall make extensive use of this result later.

* If a two-dimensional signal a(x,y) is scaled in its spatial coordinates then:

\mathcal{F}\{a(M_x \cdot x,\, M_y \cdot y)\} = \frac{1}{|M_x \cdot M_y|}\, A\!\left(\frac{u}{M_x},\, \frac{v}{M_y}\right)

* If a two-dimensional signal a(x,y) has Fourier spectrum A(u,v) then:

A(0,0) = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} a(x,y)\, dx\, dy

a(0,0) = \frac{1}{4\pi^2} \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} A(u,v)\, du\, dv

* If a two-dimensional signal a(x,y) has Fourier spectrum A(u,v) then:

\frac{\partial a(x,y)}{\partial x}  \Longleftrightarrow  j u\, A(u,v)   and   \frac{\partial a(x,y)}{\partial y}  \Longleftrightarrow  j v\, A(u,v)
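The convolution theorem and Parseval's theorem are both easy to demonstrate with the discrete Fourier transform. A minimal sketch in Python with NumPy (the function name is illustrative; note the zero-padding, which is needed because the DFT implements circular rather than linear convolution):

```python
import numpy as np

def fft_convolve2d(a, b):
    """Linear 2D convolution computed as multiplication in the frequency domain.

    Both signals are zero-padded to the full output size
    (Ma + Mb - 1, Na + Nb - 1) to avoid circular wrap-around.
    """
    shape = (a.shape[0] + b.shape[0] - 1, a.shape[1] + b.shape[1] - 1)
    A = np.fft.fft2(a, shape)            # forward transforms
    B = np.fft.fft2(b, shape)
    return np.real(np.fft.ifft2(A * B))  # multiply, then inverse transform

rng = np.random.default_rng(1)
a = rng.random((8, 8))
b = rng.random((3, 3))

# Direct spatial-domain convolution for comparison.
c_direct = np.zeros((10, 10))
for j in range(8):
    for k in range(8):
        c_direct[j:j + 3, k:k + 3] += a[j, k] * b

assert np.allclose(fft_convolve2d(a, b), c_direct)

# Parseval's theorem in its DFT form: the signal energy can be measured
# in either domain (the 1/MN factor is NumPy's unnormalized-DFT convention).
A = np.fft.fft2(a)
assert np.isclose(np.sum(np.abs(a) ** 2), np.sum(np.abs(A) ** 2) / a.size)
```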
Importance of phase and magnitude
The magnitude-phase representation above indicates that the Fourier transform of an image can be complex. This is illustrated below in Figures 4a-c. Figure 4a shows the original image a[m,n], Figure 4b the magnitude in a scaled form as log(|A(\Omega,\Psi)|), and Figure 4c the phase \varphi(\Omega,\Psi).

Figure 4: (a) Original. (b) log(|A(\Omega,\Psi)|). (c) \varphi(\Omega,\Psi).

Both the magnitude and the phase functions are necessary for the complete reconstruction of an image from its Fourier transform. Figure 5a shows what happens when Figure 4a is restored solely on the basis of the magnitude information, and Figure 5b shows what happens when Figure 4a is restored solely on the basis of the phase information.

Figure 5: (a) \varphi(\Omega,\Psi) = 0. (b) |A(\Omega,\Psi)| = constant.

Neither the magnitude information nor the phase information alone is sufficient to restore the image. The magnitude-only image (Figure 5a) is unrecognizable and has severe dynamic range problems. The phase-only image (Figure 5b) is barely recognizable, that is, severely degraded in quality.

Circularly symmetric signals
An arbitrary 2D signal a(x,y) can always be written in a polar coordinate system as a(r,\theta). When the 2D signal exhibits circular symmetry this means that:

a(x,y) = a(r,\theta) = a(r)

where r^2 = x^2 + y^2 and \tan\theta = y/x. As a number of physical systems such as lenses exhibit circular symmetry, it is useful to be able to compute an appropriate Fourier representation.

The Fourier transform A(u,v) can be written in polar coordinates A(\omega_r,\xi) and then, for a circularly symmetric signal, rewritten as a Hankel transform:

A(u,v) = \mathcal{F}\{a(x,y)\} = 2\pi \int_0^{\infty} a(r)\, J_0(\omega_r r)\, r\, dr = A(\omega_r)

where \omega_r^2 = u^2 + v^2 and J_0(\cdot) is a Bessel function of the first kind of order zero. The inverse Hankel transform is given by:

a(r) = \frac{1}{2\pi} \int_0^{\infty} A(\omega_r)\, J_0(\omega_r r)\, \omega_r\, d\omega_r
The Fourier transform of a circularly symmetric 2D signal is a function of only the radial frequency, \omega_r. The dependence on the angular frequency, \xi, has vanished. Further, if a(x,y) = a(r) is real, then it is automatically even due to the circular symmetry. According to the symmetry property above, A(\omega_r) will then be real and even.

Examples of 2D signals and transforms
Table 4 shows some basic and useful signals and their 2D Fourier transforms. In using the table entries in the remainder of this chapter we will refer to a spatial domain term as the point spread function (PSF) or the 2D impulse response, and its Fourier transform as the optical transfer function (OTF) or simply transfer function. Two standard signals used in this table are u(\cdot), the unit step function, and J_1(\cdot), the Bessel function of the first kind of order one. Circularly symmetric signals are treated as functions of r, as above.
Statistics
In image processing it is quite common to use simple statistical descriptions of images and sub-images. The notion of a statistic is intimately connected to the concept of a probability distribution, generally the distribution of signal amplitudes. For a given region, which could conceivably be an entire image, we can define the probability distribution function of the brightnesses in that region and the probability density function of the brightnesses in that region. We will assume in the discussion that follows that we are dealing with a digitized image a[m,n].

Probability distribution function of the brightnesses
The probability distribution function, P(a), is the probability that a brightness chosen from the region is less than or equal to a given brightness value a. As a increases from -\infty to +\infty, P(a) increases from 0 to 1. P(a) is monotonic, non-decreasing in a and thus dP/da >= 0.

Probability density function of the brightnesses
The probability that a brightness in a region falls between a and a + \Delta a, given the probability distribution function P(a), can be expressed as p(a)\,\Delta a, where p(a) is the probability density function:

p(a) = \frac{dP(a)}{da}

Because of the monotonic, non-decreasing character of P(a) we have that:

p(a) \geq 0   and   \int_{-\infty}^{+\infty} p(a)\, da = 1

Table 4: 2D Images and their Fourier Transforms
For an image with quantized (integer) brightness amplitudes, the interpretation of \Delta a is the width of a brightness interval. We assume constant-width intervals. The brightness probability density function is frequently estimated by counting the number of times that each brightness occurs in the region to generate a histogram, h[a]. The histogram can then be normalized so that the total area under the histogram is 1. Said another way, the p[a] for a region is the normalized count of the number of pixels, \Lambda, in a region that have quantized brightness a:

p[a] = \frac{1}{\Lambda}\, h[a]   where   \Lambda = \sum_{a} h[a]
The
brightness probability distribution function for the image shown
in Figure 4a is shown in Figure 6a. The (unnormalized) brightness
histogram of Figure 4a which is proportional to the estimated brightness
probability density function is shown in Figure 6b. The height in this
histogram corresponds to the number of pixels with a given brightness.
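The histogram h[a], the estimated density p[a], and the distribution function P[a] are straightforward to compute for a quantized image. A sketch in Python with NumPy, assuming 8-bit brightness values (the function name is illustrative):

```python
import numpy as np

def brightness_stats(img, levels=256):
    """Histogram h[a], estimated density p[a], and distribution P[a] of a region."""
    h = np.bincount(img.ravel(), minlength=levels)  # counts per brightness value
    p = h / img.size     # normalize so the total area under the histogram is 1
    P = np.cumsum(p)     # P[a] = probability(brightness <= a)
    return h, p, P

rng = np.random.default_rng(2)
img = rng.integers(0, 256, size=(64, 64))
h, p, P = brightness_stats(img)

assert np.isclose(p.sum(), 1.0)   # p[a] integrates (sums) to one
assert np.isclose(P[-1], 1.0)     # P reaches 1 at the maximum brightness
assert np.all(np.diff(P) >= 0)    # P is monotonic, non-decreasing
```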
Figure 6: (a) Brightness distribution function of Figure 4a with minimum, median, and maximum indicated. See text for explanation. (b) Brightness histogram of Figure 4a.

Both the distribution function and the histogram as measured from a region are a statistical description of that region. It must be emphasized that both P[a] and p[a] should be viewed as estimates of true distributions when they are computed from a specific region. That is, we view an image and a specific region as one realization of the various random processes involved in the formation of that image and that region. In the same context, the statistics defined below must be viewed as estimates of the underlying parameters.

Average
The average brightness of a region is defined as the sample mean of the pixel brightnesses within that region. The average, m_a, of the brightnesses over the \Lambda pixels within a region \mathcal{R} is given by:

m_a = \frac{1}{\Lambda} \sum_{(m,n) \in \mathcal{R}} a[m,n]

Alternatively, we can use a formulation based upon the (unnormalized) brightness histogram, h[a] = \Lambda \cdot p[a], with discrete brightness values a. This gives:

m_a = \frac{1}{\Lambda} \sum_{a} a \cdot h[a]
The average brightness, m_a, is an estimate of the mean brightness, \mu_a, of the underlying brightness probability distribution.

Standard deviation

The unbiased estimate of the standard deviation, s_a, of the brightnesses within a region \mathcal{R} with \Lambda pixels is called the sample standard deviation and is given by:

s_a = \sqrt{\frac{1}{\Lambda - 1} \sum_{(m,n) \in \mathcal{R}} \left( a[m,n] - m_a \right)^2}

Using the histogram formulation gives:

s_a = \sqrt{\frac{\sum_{a} a^2 \cdot h[a] \;-\; \Lambda\, m_a^2}{\Lambda - 1}}

The standard deviation, s_a, is an estimate of \sigma_a of the underlying brightness probability distribution.

Coefficient-of-variation

The dimensionless coefficient-of-variation, CV, is defined as:

CV = \frac{s_a}{m_a} \times 100\%
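The direct and histogram-based formulations give identical results, which the following sketch in Python with NumPy confirms (variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
region = rng.integers(0, 256, size=(32, 32))
n_pixels = region.size  # Lambda, the number of pixels in the region

# Direct formulation over the pixels.
m_a = region.mean()
s_a = region.std(ddof=1)  # ddof=1 gives the unbiased sample standard deviation

# Histogram formulation with discrete brightness values a.
a_vals = np.arange(256)
h = np.bincount(region.ravel(), minlength=256)
m_hist = (a_vals * h).sum() / n_pixels
s_hist = np.sqrt(((a_vals ** 2 * h).sum() - n_pixels * m_hist ** 2)
                 / (n_pixels - 1))

cv = 100.0 * s_a / m_a  # dimensionless coefficient-of-variation, in percent

assert np.isclose(m_a, m_hist)
assert np.isclose(s_a, s_hist)
```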
Percentiles
The percentile, p%, of an unquantized brightness distribution is defined as that value of the brightness a such that:

P(a) = p%   or equivalently   \int_{-\infty}^{a} p(\alpha)\, d\alpha = p\%

Three special cases are frequently used in digital image processing.

* 0%: the minimum value in the region
* 50%: the median value in the region
* 100%: the maximum value in the region

All three of these values can be determined from Figure 6a.

Mode

The mode of the distribution is the most frequent brightness value. There is no guarantee that a mode exists or that it is unique.

Signal-to-Noise ratio
The signal-to-noise ratio, SNR, can have several definitions. The noise is characterized by its standard deviation, s_n. The characterization of the signal can differ. If the signal is known to lie between two boundaries, a_{min} <= a <= a_{max}, then the SNR is defined as:

Bounded signal:  SNR = 20 \log_{10}\!\left( \frac{a_{max} - a_{min}}{s_n} \right) dB

If the signal is not bounded but has a statistical distribution then two other definitions are known:

Stochastic signal:

S & N inter-dependent:  SNR = 20 \log_{10}\!\left( \frac{m_a}{s_n} \right) dB

S & N independent:  SNR = 20 \log_{10}\!\left( \frac{s_a}{s_n} \right) dB

where m_a and s_a are defined above.
shown in Figure 7. Figure 7 Table 5 Region is the interior of the circle. Statistics from Figure 7 A
SNR calculation for the entire image based on eq. is not
directly available. The variations in the image brightnesses that lead
to the large value of s (=49.5) are not, in general, due to noise
but to the variation in local information. With the help of the region
there is a way to estimate the SNR. We can use the s
(=4.0) and the dynamic range, amax - amin,
for the image (=241-56) to calculate a global SNR (=33.3 dB). The
underlying assumptions are that 1) the signal is approximately constant
in that region and the variation in the region is therefore due to
noise, and, 2) that the noise is the same over the entire image with a
standard deviation given by sn = s
.
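The global SNR estimate above can be reproduced directly from the quoted numbers. A short check in Python:

```python
import math

# Values quoted in the text: dynamic range of the whole image and the
# standard deviation measured inside the (approximately constant) region,
# used here as the noise estimate s_n.
a_max, a_min = 241.0, 56.0
s_n = 4.0

# Bounded-signal definition: SNR = 20 * log10((a_max - a_min) / s_n) dB
snr_db = 20.0 * math.log10((a_max - a_min) / s_n)
print(round(snr_db, 1))  # 33.3
```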
Contour Representations
When dealing with a region or object, several compact representations are available that can facilitate manipulation of and measurements on the object. In each case we assume that we begin with an image representation of the object as shown in Figure 8a,b. Several techniques exist to represent the region or object by describing its contour.

Chain code
This representation is based upon the work of Freeman. We follow the contour in a clockwise manner and keep track of the directions as we go from one contour pixel to the next. For the standard implementation of the chain code we consider a contour pixel to be an object pixel that has a background (non-object) pixel as one or more of its 4-connected neighbors. See Figures 3a and 8c.

The codes associated with the eight possible directions are the chain codes and, with x as the current contour pixel position, the codes are generally defined as:

        3  2  1
        4  x  0
        5  6  7
Figure 8: Region (shaded) as it is transformed from (a) continuous to (b) discrete form and then considered as a (c) contour or (d) run lengths illustrated in alternating colors.

Chain code properties
* Even codes {0,2,4,6} correspond to horizontal and vertical directions; odd codes {1,3,5,7} correspond to the diagonal directions.

* Each code can be considered as the angular direction, in multiples of 45 degrees, that we must move to go from one contour pixel to the next.

* The absolute coordinates [m,n] of the first contour pixel (e.g. top, leftmost) together with the chain code of the contour represent a complete description of the discrete region contour.

* When there is a change between two consecutive chain codes, then the contour has changed direction. This point is defined as a corner.
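The third property, that the start pixel plus the chain code completely describe the contour, can be demonstrated with a small encoder/decoder pair. A sketch in Python (function names are illustrative; this encodes an already-traced contour rather than performing the full contour-following algorithm):

```python
# Chain-code direction -> (dm, dn) offset, with code 0 = east and the codes
# increasing counter-clockwise as in the 3-2-1 / 4-x-0 / 5-6-7 layout above.
# Row index m grows downward, so "north" corresponds to dm = -1.
OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1),
           (0, -1), (1, -1), (1, 0), (1, 1)]

def encode_chain(pixels):
    """Chain code of a contour given as a list of successive (m, n) pixels."""
    codes = []
    for (m0, n0), (m1, n1) in zip(pixels, pixels[1:]):
        codes.append(OFFSETS.index((m1 - m0, n1 - n0)))
    return codes

def decode_chain(start, codes):
    """Reconstruct the contour pixels from the start pixel and the chain code."""
    pixels = [start]
    for c in codes:
        dm, dn = OFFSETS[c]
        m, n = pixels[-1]
        pixels.append((m + dm, n + dn))
    return pixels

contour = [(0, 0), (0, 1), (1, 2), (2, 2), (2, 1), (1, 0)]
codes = encode_chain(contour)
# The first pixel plus the chain code fully describe the discrete contour:
assert decode_chain((0, 0), codes) == contour
```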
Crack code
An alternative to the chain code for contour encoding is to use neither the contour pixels associated with the object nor the contour pixels associated with the background but rather the line, the "crack", in between. This is illustrated with an enlargement of a portion of Figure 8 in Figure 9. The "crack" code can be viewed as a chain code with four possible directions instead of eight:

           1
        2  x  0
           3
Figure 9: (a) Object including part to be studied. (b) Contour pixels as used in the chain code are diagonally shaded. The "crack" is shown with the thick black line.

The chain code for the enlarged section of Figure 9b, from top to bottom, is {5,6,7,7,0}. The crack code is {3,2,3,3,0,3,0,0}.

Run codes
A third representation is based on coding the consecutive pixels along a row, a run, that belong to an object by giving the starting position of the run and the ending position of the run. Such runs are illustrated in Figure 8d. There are a number of alternatives for the precise definition of the positions. Which alternative should be used depends upon the application and thus will not be discussed here.
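One of the possible position conventions, inclusive (start, end) column indices per row, can be sketched as follows in Python with NumPy (the convention and function names are chosen here purely for illustration):

```python
import numpy as np

def run_encode(mask):
    """Encode a binary object mask as runs: (row, start_col, end_col), inclusive."""
    runs = []
    for m, row in enumerate(mask):
        n = 0
        while n < len(row):
            if row[n]:                          # start of a run of object pixels
                start = n
                while n < len(row) and row[n]:
                    n += 1
                runs.append((m, start, n - 1))  # inclusive end position
            else:
                n += 1
    return runs

def run_decode(runs, shape):
    """Rebuild the binary mask from its run representation."""
    mask = np.zeros(shape, dtype=bool)
    for m, start, end in runs:
        mask[m, start:end + 1] = True
    return mask

mask = np.array([[0, 1, 1, 0, 1],
                 [1, 1, 0, 0, 0],
                 [0, 0, 0, 1, 1]], dtype=bool)
runs = run_encode(mask)
assert runs == [(0, 1, 2), (0, 4, 4), (1, 0, 1), (2, 3, 4)]
assert np.array_equal(run_decode(runs, mask.shape), mask)
```

The round trip (encode, then decode) recovers the mask exactly, which is the defining property of any run representation regardless of the position convention chosen.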