Convolution and correlation
\( \newcommand{\Conv}{\mathop{\rm Conv}} \newcommand{\Corr}{\mathop{\rm Corr}} \)
These notes contain basic information about two mathematical operations that are commonly used in signal processing, convolution and correlation. Convolution is used to linearly filter a signal. Correlation is used to characterize the statistical dependencies between two signals.
Convolution
Consider two time series, and , where the index runs from to . Their convolution is defined as
This definition is applicable to time series of infinite length. If and are finite, they can be extended to infinite length by adding zeros at both ends. After this trick, called zero padding, the definition in Eq. (\ref{eq:ConvolutionInfinite}) becomes applicable. For example, Eq. (\ref{eq:ConvolutionInfinite}) reduces to
for the finite time series .
Exercise. Convolution is commutative and associative. Prove that and .
Exercise. Convolution is distributive over addition. Prove that . This means that filtering a signal via convolution is a linear operation.
Although and are treated symmetrically by convolution, they usually have very different meanings. Typically, one is a signal that goes on for a long time. The other is concentrated near time zero, and is called a kernel or filter. The output of the convolution is also a signal, a filtered version of the input signal.
In Eq. (\ref{eq:ConvolutionFinite}), we chose to be zero for all negative . This is called a causal filter, because is affected by in the present and past, but not in the future. In some contexts, the causality constraint is not important, and one can take to be nonzero, for example.
Formulas are nice and compact, but now let’s draw some diagrams to see how convolution works. To take a concrete example, assume a causal filter . Then the th component of the convolution involves aligning and this way:
In words, is computed by looking at the signal through a window of length starting at time and extending back to time . The weighted sum of the signals in the window is taken, using the coefficients given by .
Note that is left-right “flipped” in the diagrams. This is because of the minus sign that appears in the indices of convolution. The flipping may seem counterintuitive, but it’s necessary to make convolution commutative. Also it means the kernel can be interpreted as the impulse response function, which will be defined shortly.
If the signal has a finite length, then the diagram looks different when the window sticks out over the edges. Consider a signal . Let’s consider the two extreme cases where the window includes only one time bin in the signal. One extreme is , which can be visualized as
The other extreme is , which can be visualized as
The range has length . Outside this “full” range of the convolution, the elements of must vanish.
The range has length . Inside this “valid” range, the kernel does not stick out beyond the borders of the signal; the kernel never encounters the zero padding of the signal. (Here we are assume that the signal is at least as long as the kernel , .)
Using the conv
function in Julia and MATLAB.
Suppose that , where and .
Then conv(g,h)
returns the “full” convolution .
It is the only option for Julia and the default option in MATLAB.
Alternatively, conv(g,h,"valid")
returns the “valid” convolution “. This extension to Julia’s built-in conv
is provided in the sample code accompanying the homework. The “valid” option is available in MATLAB.
The Numpy function is called convolve
, and has both “full” and “valid” options like MATLAB.
Note that the inputs to the conv
function don’t include information about absolute time.
In general, the inputs could be interpreted as and .
Then the output should be interpreted as . In other words, shifting either or
in time is equivalent to shifting in time.
For example, suppose that is a signal, and represents an
acausal filter, with and . Then the first element of
returned by conv
is , and the last is
. So discarding the first and last
elements of leaves us with . This is
time-aligned with the signal , and has the
same length.
Impulse response
Consider the signal consisting of a single impulse at time zero,
The convolution of this signal with a kernel is
which is just the kernel again. In other words , is the response of the kernel to an impulse, or the impulse response function. If the impulse is displaced from time 0 to time , then the result of the convolution is the kernel , displaced by time steps.
Elsewhere you may have seen the ``Kronecker delta’’ notation , which is equivalent to . The Kronecker delta is just the identity matrix, since it is equal to one only for the diagonal elements . You can think about convolution with as multiplication by the identity matrix . More generally, the following exercise shows that convolution is equivalent to multiplication of a matrix and a vector.
Exercise. Matrix form of convolution. Show that the convolution of and can be written as
where the matrix is defined by
and and are treated as column vectors.
Exercise. Each column of is the same time series, but shifted by a different
amount. Use the MATLAB function convmtx
to create matrices
like from time series like .
If you don’t have this toolbox installed, you can make use of the fact
that Eq. (\ref{eq:ConvolutionMatrix}) is a Toeplitz matrix, and can
be constructed by giving its first column and first row to the
toeplitz
command in MATLAB.
Exercise. Convolution as polynomial multiplication. If the second degree polynomials and are multiplied together, the result is a fourth degree polynomial. Let’s call this polynomial . Show that this is equivalent to .