Introduction to Computer Vision
Contents
1 Interpolation
2 Point operators
3 Histogram based image operations
4 Least Squares Estimators
5 Geometric operators
6 Homogeneous coordinates
7 Local operators
8 Local structure
9 Image stitching using SIFT
10 Pinhole camera
11 Convolutional neural networks
12 Motion
Introduction to Computer Vision Summary CHoogteijling
1 Interpolation
Interpolation allows us to find the value of a function in between the points where the image is
sampled.
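Nearest neighbor interpolation assigns to the interpolated function $\hat{f}$ at coordinate x the
value of the sample F(k) that is closest to x. The resulting function is piecewise constant: it is
discontinuous, and hence not differentiable, halfway between sample points.
$$\hat{f}(x) = F\left(\left\lfloor x + \tfrac{1}{2} \right\rfloor\right)$$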
With linear interpolation we assume that, between adjacent sample points k and k + 1, the function
is linear. The interpolated function is continuous but not differentiable at the sample points. In
between the sample points its second and higher order derivatives are zero.
$$k \le x \le k + 1: \quad \hat{f}(x) = (1 - (x - k))\,F(k) + (x - k)\,F(k + 1)$$
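The nearest neighbor and linear formulas above can be tried out on a 1D list of samples. This is a minimal sketch, assuming the samples sit at integer coordinates 0, 1, 2, ... and that x lies inside the sampled range; the function names and sample values are made up for illustration.

```python
import math

def nearest(F, x):
    """Nearest neighbor: value of the sample closest to x (ties go to the right)."""
    return F[math.floor(x + 0.5)]

def linear(F, x):
    """Linear interpolation between the samples at k and k + 1."""
    k = min(math.floor(x), len(F) - 2)  # keep k + 1 inside the list
    t = x - k                           # position within the interval, 0 <= t <= 1
    return (1 - t) * F[k] + t * F[k + 1]

F = [0.0, 10.0, 20.0, 40.0]  # hypothetical samples F(0)..F(3)
print(nearest(F, 1.4), nearest(F, 1.5))
print(linear(F, 2.25))
```

Neither function checks bounds, so coordinates outside the sampled range need one of the extrapolation strategies first.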
With cubic interpolation we look at two pixels on the left and two on the right. To interpolate
the function value f (x) for x in between x = k and x = k + 1 we fit a cubic polynomial to the
sample points {k − 1, k, k + 1, k + 2}.
$$k \le x \le k + 1: \quad \hat{f}(x) = a(x - k)^3 + b(x - k)^2 + c(x - k) + d$$
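One way to obtain the coefficients a, b, c, d is to require that the polynomial passes exactly through the four samples. The sketch below assumes that reading of "fit" and solves the resulting four equations by hand; other cubic schemes (for example cubic convolution kernels) give different coefficients. The function name and test data are illustrative.

```python
import math

def cubic(F, x):
    """Evaluate at x the cubic through the samples at k-1, k, k+1, k+2."""
    k = math.floor(x)  # requires 1 <= k <= len(F) - 3
    p0, p1, p2, p3 = F[k - 1], F[k], F[k + 1], F[k + 2]
    # Coefficients from the conditions p(-1) = p0, p(0) = p1, p(1) = p2, p(2) = p3
    a = (p3 - 3 * p2 + 3 * p1 - p0) / 6
    b = (p0 + p2) / 2 - p1
    c = (p2 - p0) / 2 - a
    d = p1
    t = x - k
    return ((a * t + b) * t + c) * t + d  # Horner evaluation of a t^3 + b t^2 + c t + d

# Hypothetical samples taken from a cubic, which the fit should reproduce
F = [i**3 - 2 * i + 1 for i in range(6)]
print(cubic(F, 2.5))
```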
There are better interpolation methods, for example using more samples. A disadvantage of higher
order polynomials is overfitting of the original function: higher order polynomials tend to fluctuate
wildly in between the sample points.
For 2D functions we can also use nearest neighbor, cubic and spline interpolation. We first
interpolate in the x-direction and then in the y-direction.
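The "first x, then y" idea can be sketched with any 1D scheme as the building block. The example below uses linear interpolation (giving bilinear interpolation), chosen only because it is the shortest to write down; the layout img[row][column] and the function names are assumptions of this sketch.

```python
import math

def lerp(a, b, t):
    """1D linear interpolation between the values a and b."""
    return (1 - t) * a + t * b

def bilinear(img, x, y):
    """img[row][col] holds the sample at (x=col, y=row)."""
    i = min(math.floor(x), len(img[0]) - 2)  # column of the left neighbours
    j = min(math.floor(y), len(img) - 2)     # row of the upper neighbours
    tx, ty = x - i, y - j
    top = lerp(img[j][i], img[j][i + 1], tx)             # along x in row j
    bottom = lerp(img[j + 1][i], img[j + 1][i + 1], tx)  # along x in row j + 1
    return lerp(top, bottom, ty)                         # then along y

img = [[0, 10],
       [20, 30]]  # hypothetical 2x2 image
print(bilinear(img, 0.5, 0.5))
```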
Extrapolation allows us to find the value of a function outside the domain of the image. To find
the value we can:
• Set the value of a point outside the bounds of the grid to a constant value (often zero).
• Pick a point that is within the bounds of the grid and use the (interpolated) value in that
point.
– Closest point. We select the point inside the bounds of the grid that is closest to the
outside point.
– Mirrored point. We mirror the outside point in the vertical line through the last
sample points in horizontal direction of the grid.
– Wrapping. We select the same point from inside the bounds. Imagine a tiled wall and
each tile showing the same image. This is what the discrete Fourier transform implicitly
assumes.
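The strategies above can be compared on a 1D row of samples. This is a minimal sketch for integer positions only; the mode names are made up here, and the mirror variant reflects in the first or last sample without repeating it, which is one reading of the description above.

```python
def sample(F, i, mode, constant=0):
    """Return the sample at integer index i, extrapolating when i is out of bounds."""
    n = len(F)
    if 0 <= i < n:
        return F[i]
    if mode == "constant":
        return constant
    if mode == "closest":
        return F[min(max(i, 0), n - 1)]
    if mode == "mirror":
        # Reflect in the first/last sample (border sample not repeated); needs n > 1
        period = 2 * (n - 1)
        j = i % period
        return F[j if j < n else period - j]
    if mode == "wrap":
        return F[i % n]  # tiled copies of the same row
    raise ValueError(f"unknown mode: {mode}")

F = [10, 20, 30, 40]  # hypothetical samples at indices 0..3
for mode in ("constant", "closest", "mirror", "wrap"):
    print(mode, sample(F, -1, mode), sample(F, 5, mode))
```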
1.1 Image histograms
A histogram of the scalar pixel values in an image summarizes how those values are distributed over
the range of possible values. There is no single principled way to choose the bin size. One rule of
thumb is Sturges’ rule k = ⌈log₂ n⌉ + 1, where k is the number of bins and n the number of samples.
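A univariate histogram with bin edges e_i is defined as:
$$h_f[i] = \sum_{x \in E} [e_i \le f(x) < e_{i+1}]$$
The sum runs over all x in E, and a bracket [·] equals 1 when its condition holds and 0 otherwise.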
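The histogram definition translates directly into a counting loop. This is a small sketch, assuming equal-width bins and using Sturges' rule for the number of bins; the function names and pixel values are illustrative only.

```python
import math

def sturges_bins(n):
    """Number of bins suggested by Sturges' rule for n samples."""
    return math.ceil(math.log2(n)) + 1

def histogram(values, edges):
    """h[i] counts the values v with edges[i] <= v < edges[i + 1]."""
    h = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(h)):
            if edges[i] <= v < edges[i + 1]:
                h[i] += 1
                break
    return h

values = [0, 1, 2, 3, 4, 5, 6, 7]          # hypothetical pixel values
k = sturges_bins(len(values))              # 4 bins for 8 samples
edges = [8 * i / k for i in range(k + 1)]  # equal-width bins over [0, 8)
print(k, histogram(values, edges))
```

Because every bin is half-open, a value equal to the last edge is not counted, so the last edge should lie just above the largest possible value.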
To summarize multi-dimensional data we can compute a multivariate histogram, here for three
components f_1, f_2 and f_3:
$$h_f[i, j, k] = \sum_{x \in E} [e_{1,i} \le f_1(x) < e_{1,i+1}]\,[e_{2,j} \le f_2(x) < e_{2,j+1}]\,[e_{3,k} \le f_3(x) < e_{3,k+1}]$$
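For a three-component image such as RGB, the multivariate histogram becomes a 3D array of counts. The sketch below assumes pixels are given as (f1, f2, f3) tuples and that each component has its own list of bin edges; the names are hypothetical.

```python
from bisect import bisect_right

def bin_index(edges, v):
    """Index i with edges[i] <= v < edges[i + 1], or None if v falls outside all bins."""
    i = bisect_right(edges, v) - 1
    return i if 0 <= i < len(edges) - 1 else None

def histogram3(pixels, e1, e2, e3):
    """Joint histogram h[i][j][k] of 3-component pixel values."""
    h = [[[0] * (len(e3) - 1) for _ in range(len(e2) - 1)]
         for _ in range(len(e1) - 1)]
    for f1, f2, f3 in pixels:
        i, j, k = bin_index(e1, f1), bin_index(e2, f2), bin_index(e3, f3)
        if None not in (i, j, k):  # a pixel counts only if all three brackets are 1
            h[i][j][k] += 1
    return h

edges = [0, 128, 256]  # two bins per channel (illustrative choice)
pixels = [(0, 0, 0), (255, 0, 0), (10, 20, 30), (200, 10, 10)]
h = histogram3(pixels, edges, edges, edges)
print(h[0][0][0], h[1][0][0])
```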
2 Point operators
A point operator γ is a function that is constructed by pointwise lifting a value operator to the
image domain. Consider two images f : D → R and g : D → R′, and let γ : R × R′ → R′′ be an operator
on values. The operator γ can be lifted to work on images:
∀x ∈ D : γ(f, g)(x) = γ(f (x), g(x))
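Lifting can be written as a higher-order function that takes a value operator and returns an image operator. A minimal sketch, assuming images are nested lists of equal size; lift and image_max are names invented for this example.

```python
def lift(gamma):
    """Turn an operator on values into an operator on images (nested lists)."""
    def lifted(f, g):
        return [[gamma(a, b) for a, b in zip(row_f, row_g)]
                for row_f, row_g in zip(f, g)]
    return lifted

image_max = lift(max)  # pointwise maximum of two images
f = [[1, 5], [7, 2]]
g = [[4, 3], [6, 9]]
print(image_max(f, g))
```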
α-blending takes the weighted average of two images. Let f and g be two colour images defined
on the same spatial domain. A sequence of images that shows a smooth transition from f to g is
obtained from the following equation by letting α increase from 0 to 1.
h_α = (1 − α)f + αg
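The blending equation can be applied pixel by pixel to produce the transition sequence. This sketch uses scalar images stored as nested lists for brevity; for colour images the same formula would be applied to each channel. The function name and values are illustrative.

```python
def blend(f, g, alpha):
    """Pointwise weighted average (1 - alpha) * f + alpha * g of two images."""
    return [[(1 - alpha) * a + alpha * b for a, b in zip(row_f, row_g)]
            for row_f, row_g in zip(f, g)]

f = [[0, 100]]    # hypothetical 1x2 images
g = [[100, 200]]
frames = [blend(f, g, step / 4) for step in range(5)]  # alpha = 0, 0.25, ..., 1
print(frames[1])
```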
Unsharp masking uses α-blending to sharpen an image. Let f be an image and g an unsharp (blurred)
version of it. The result is obtained by adding β times the difference f − g to the original
image, which is the blending equation with α = −β.
h = f + β(f − g)
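The effect is easiest to see on a 1D step edge. In the sketch below the unsharp version g is produced by a 3-tap box blur, which is only one possible choice of blur and is an assumption of this example; the function names are made up.

```python
def box_blur(f):
    """3-tap box blur of a 1D signal; borders use the closest sample."""
    n = len(f)
    return [(f[max(i - 1, 0)] + f[i] + f[min(i + 1, n - 1)]) / 3 for i in range(n)]

def unsharp_mask(f, beta):
    """h = f + beta * (f - g), with g a blurred version of f."""
    g = box_blur(f)
    return [a + beta * (a - b) for a, b in zip(f, g)]

f = [0, 0, 10, 10]  # a step edge
h = unsharp_mask(f, beta=1.0)
print(h)
```

The result undershoots just before the edge and overshoots just after it, which is what makes the edge look sharper.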
Image thresholding uses a relational operator. Let f be a scalar image; then [f > t], for a
constant t, results in a binary image.
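As a small illustration, the bracket [f > t] can be evaluated per pixel, giving 1 where the condition holds and 0 elsewhere. The function name and pixel values are assumptions of this sketch.

```python
def threshold(f, t):
    """Binary image: 1 where f > t, 0 elsewhere."""
    return [[int(v > t) for v in row] for row in f]

f = [[10, 200],
     [128, 127]]  # hypothetical scalar image
print(threshold(f, 127))
```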
3 Histogram based image operations
A monadic point operator is an operator that changes the pixel value f(x) independently of the
position x and of all other pixel values in the neighbourhood.