Recall the definition of convolution of integrable functions. For sufficiently well-behaved functions on real spaces, this operation can be performed by passing to the Fourier domain, multiplying there, and coming back through the inverse Fourier transform. The same principle applies to digital signals, using instead the discrete Fourier transform and its inverse. With the aid of FFT algorithms to compute the discrete Fourier transform, convolution via the frequency domain can be faster than convolving the signals directly in the time domain (scipy.ndimage.convolve in sage), and the final result is the same, except possibly at the borders of the image. Some care must be taken when performing the convolution this way, since zero-padding needs to be applied.
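In symbols, these are the standard statements of the convolution theorem: for integrable \( f \) and \( g \) on \( \mathbb{R}^n, \)
\[ (f*g)(x) = \int_{\mathbb{R}^n} f(y)\, g(x-y)\, dy, \qquad \widehat{f*g} = \hat{f}\cdot\hat{g}, \quad\text{so}\quad f*g = \mathcal{F}^{-1}\big(\mathcal{F}(f)\,\mathcal{F}(g)\big). \]
For finite signals, a product of discrete Fourier transforms corresponds to circular convolution, which is why both signals must be zero-padded to the size of the full linear convolution before multiplying.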
The following example shows several ways to proceed, and offers some intuition on how these approaches differ from one another:
from numpy import *
import scipy.ndimage
import scipy.signal
image = float32(random_matrix(ZZ, 6, 6))
kern = float32(random_matrix(ZZ, 3, 3))
# Direct convolution in the spatial domain
result1 = scipy.ndimage.convolve(image, kern)
# Circular convolution via the DFT (kernel zero-padded only to the image size)
result2 = fft.ifft2( multiply(fft.fft2(image), fft.fft2(kern, image.shape)) )
# Full linear convolution, computed through the FFT by scipy
result3 = scipy.signal.fftconvolve(image, kern)
# Full linear convolution via the DFT, with explicit zero-padding of both factors
l, w = image.shape[0] + kern.shape[0] - 1, image.shape[1] + kern.shape[1] - 1
result4 = fft.ifft2( multiply(fft.fft2(image, [l, w]), fft.fft2(kern, [l, w])) )
[Output: the matrices `result1`, `real(result2)`, `real(result3[1:7,1:7])`, and `real(result4[1:7,1:7])` displayed side by side.]
[Note: Both result1 and result2 are matrices with size \( 6\times 6, \) while the matrices result3 and result4, which are identical, have size \( 8\times 8. \)]
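As a quick numerical check (a minimal sketch reusing the variables from the listing above; the tolerance is an arbitrary choice to absorb float32 round-off), one can confirm that result3 and the real part of result4 coincide:

# result3 (fftconvolve) and the zero-padded DFT product should agree
print(allclose(result3, real(result4), atol=1e-4))   # expected: True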
Another word of advice: one might feel tempted to create a Gaussian of the same size and shape as the original image. For example, if the original image has size \( 512 \times 512, \) to create the Gaussian \( G_{16} \) one could proceed in sage
as follows:
from numpy import *
import scipy
# Create a 512x512 radial matrix
[X, Y] = mgrid[-256:256, -256:256]
A = X + 1j*Y
R = abs(A)
# Create a full(!?) 512x512 gaussian kernel Gt
t = 16.0
Gt = exp(-multiply(R, R)/4.0/t)/4.0/pi/t
Gt = float32(Gt)
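Since the continuous kernel integrates to one and the sampling step is much smaller than its standard deviation \( \sqrt{2t} \approx 5.66, \) the discrete sum of Gt should be very close to one; a quick check (optional, not part of the original construction):

print(Gt.sum())   # expected: approximately 1.0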
Convolving a noisy version of an image with this kernel could then be done as follows:
from numpy import *
import scipy.misc, scipy.ndimage
# Consider the lena image
image = scipy.misc.lena().astype(float32)
# Add some white Gaussian noise mu=0 sigma=16
NoisyImage = image + random.normal(0, 16, size=image.shape)
# Clip the noisy values back to the valid range [0, 255]
NoisyImageBase = multiply( NoisyImage>=0 , NoisyImage<=255 )
NoisyImage = multiply( NoisyImage, NoisyImageBase ) + 255*(NoisyImage>255.0)
# Convolve the noisy image with the full kernel Gt
convImageGt = scipy.ndimage.convolve(NoisyImage, Gt)
[Figure: the original `image`, the noisy version `image + noise`, and `convImageGt`, displayed side by side.]
Granted: it accomplishes the requested operation, but at the expense of performing far too many computations and storing unnecessary data, especially when the original image contains millions of pixels!
Note the structure of the Gaussian kernel \( G_{1} \) (below, left). This function is effectively zero beyond about three standard deviations from the mean, so it does not make sense to store more than a small center window (below, right). We therefore collect the significant values in a small matrix and use that as the kernel when computing the convolution.
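A possible way to carry this out (a minimal sketch, not the author's exact construction: the three-standard-deviation window radius and the renormalization of the truncated kernel are assumptions, and NoisyImage is the array from the previous listing):

from numpy import *
import scipy.ndimage

t = 16.0
# Window radius of about three standard deviations (sigma = sqrt(2*t) for this kernel)
radius = int(ceil(3.0*sqrt(2.0*t)))                  # 17 pixels for t = 16
[X, Y] = mgrid[-radius:radius+1, -radius:radius+1]
R = abs(X + 1j*Y)
smallGt = exp(-multiply(R, R)/4.0/t)/4.0/pi/t
smallGt = float32(smallGt/smallGt.sum())             # renormalize the truncated kernel
# Convolve with the small 35x35 window instead of the full 512x512 kernel
convImageSmallGt = scipy.ndimage.convolve(NoisyImage, smallGt)

Away from the image borders, the result should be visually very close to convImageGt, while the direct convolution now only touches a few hundred kernel taps per pixel instead of a quarter of a million.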