1. Introduction
Sampling a continuous signal is the operation of taking samples of the signal to obtain a discrete signal, that is to say a series of numbers representing the signal, in order to store, transmit, or process the signal.
Sampling takes place in the analog-to-digital conversion operation, for example in a sound or image digitization device. Another example of sampling is that which is done to obtain the graphical representation of a function with one or two variables. Generally speaking, sampling is involved in any continuous / discrete conversion operation.
This document presents Shannon’s sampling theorem, which makes it possible to know at what minimum frequency a signal must be sampled so as not to lose the information it contains. We will also see how the reconstruction of a continuous signal is carried out from the samples, an operation which takes place in the digital-analog conversion.
To explain sampling and reconstruction, we must use spectral analysis and the discrete Fourier transform, discussed in the document Introduction to spectral analysis.
We will be interested in a temporal signal represented by a function u (t), where t is the time, but the results are easily transposed to the cases of functions of other variables, for example space variables. In particular, the results will be usable for the sampling of an image, that is to say a function I (x, y) of two space variables.
For simplicity, we will limit ourselves to the case of periodic signals.
2. Sampling
2.a. Shannon’s theorem
Let u (t) be a function representing a continuous signal. We consider a periodic sampling defined by: tk = kTe (1) uk = u (tk) (2)
where k is an integer. Te is the sampling period. fe = 1 / Te is the sampling frequency.
Shannon’s theorem ([1]) concerns signals whose spectrum has a maximum frequency fmax, which are called band-limited signals. For example, if u (t) is a trigonometric polynomial, the maximum frequency is that of the greatest harmonic.
Shannon’s theorem: so that the signal can be completely reconstructed from the samples, it is necessary and sufficient that: fe> 2fmax (3)
The sampling frequency must be strictly greater than twice the greatest frequency present in the spectrum of the continuous signal (Nyquist-Shannon condition). If this condition is true then: u (t) = ∑k = -∞ + ∞uksinct-kTeTe (4)
where the cardinal sine function is defined by: sinc (x) = sin (πx) πx (5)
This relationship shows that the signal can be reconstructed from the samples, which means that all of the information present in the original signal is retained in the samples. We will see later how the reconstruction operation is carried out in practice.
Half of the sampling frequency is called the Nyquist frequency fn and the Nyquist-Shannon condition is therefore written fmax <fn.
When the condition is not verified, it is said that there is under-sampling. We speak of oversampling when the Nyquist frequency is much greater than fmax.
2.b. Sinusoidal function
To illustrate Shannon’s theorem, let us first consider the case of a sinusoidal function. We define a period 1 function:
import math def u(t): return math.sin(2*math.pi*t)
The maximum frequency is obviously fmax = 1. We will perform two samples of this function. The first with a large frequency in front of 1, to draw the sinusoid, the second with a lower frequency but respecting the Nyquist-Shannon condition (greater than 2)
import numpy from matplotlib.pyplot import * T=40.0 fe1 = 100.0 te1 = 1.0/fe1 N1 = int(T*fe1) t1 = numpy.arange(0,N1)*te1 x1 = numpy.zeros(N1) for k in range(N1): x1[k] = u(t1[k]) fe2 = 2.234 te2 = 1.0/fe2 N2 = int(T*fe2) t2 = numpy.arange(0,N2)*te2 x2 = numpy.zeros(N2) for k in range(N2): x2[k] = u(t2[k]) figure(figsize=(12,4)) plot(t1,x1,"b") stem(t2,x2,"r") xlabel("t") ylabel("u") grid()

If we connect the samples by segments, we get of course a very bad representation of the sinusoid:
figure(figsize=(12,4)) plot(t2,x2,"r") xlabel("t") ylabel("u") grid()

According to Shannon’s theorem, however, it is possible to completely reconstruct the signal. To do this, we will instead place ourselves in frequency space, by calculating the discrete Fourier transform of the samples.
import numpy.fft tfd = numpy.fft.fft(x2) f = numpy.arange(0,N2)*1.0/T figure(figsize=(10,5)) stem(f,numpy.absolute(tfd)/N2) xlabel('f') ylabel('A') grid()
print(N2) --> 89

The spectrum of the discrete signal has two maxima, the first at frequency 1, and its image at frequency 2.234-1. We will cut the discrete Fourier transform into two parts:
tfd_A = tfd[0:N2/2] tfd_B = tfd[N2/2:N2]
You can now increase the sampling frequency by adding zeros between these two parts. The frequency interval between two neighboring points remains 1 / T. The new sample rate is calculated from the total number of points.
nz = 2000 N3 = nz+N2 zeros = numpy.zeros(nz) tfd3 = numpy.concatenate((tfd_A,zeros,tfd_B)) fe3 = N3/T f = numpy.arange(0,N3)*1.0/T figure(figsize=(10,4)) stem(f,numpy.absolute(tfd3)) xlabel('f') ylabel('A') grid()

This operation carried out in the frequency domain amounts to increasing the sampling frequency without losing information. It remains to perform the inverse discrete Fourier transform:
x3 = numpy.fft.ifft(tfd3) t3 = numpy.arange(0,N3)*1.0/fe3 figure(figsize=(10,4)) plot(t3,x3) xlabel('t') ylabel('x3') grid()

Although the result is not perfect, we get a reconstruction of the initial sinusoid. Shannon’s formula (4) applies to a signal not limited in time. In practice, it has a finite duration T, which is why the reconstruction is imperfect. We will see later that the reconstruction is done in practice in the time domain and not in this way.
2.c. Signal périodique
A periodic function is decomposed into a sum of sinusoidal functions (Fourier series). Here is an example with 3 harmonics, of order 1,3 and 5:
def u(t): return 1.0*math.sin(2*math.pi*t)\ +0.5*math.sin(3*2*math.pi*t+math.pi/3)\ +0.3*math.sin(5*2*math.pi*t-math.pi/6)
The largest frequency of the signal spectrum is that of the 5th harmonic. To comply with the Nyquist-Shannon condition, a sampling frequency greater than 10 is therefore necessary. 40 periods are sampled with a frequency of 12.345. The sampling frequency is chosen not a multiple of that of the signal, as is most often in reality.
T=40.0 fe1 = 100 te1 = 1.0/fe1 N1 = int(T*fe1) t1 = numpy.arange(0,N1)*te1 x1 = numpy.zeros(N1) for k in range(N1): x1[k] = u(t1[k]) fe2 = 12.345 te2 = 1.0/fe2 N2 = int(T*fe2) t2 = numpy.arange(0,N2)*te2 x2 = numpy.zeros(N2) for k in range(N2): x2[k] = u(t2[k]) figure(figsize=(12,4)) plot(t1,x1,"b") stem(t2,x2,"r") xlabel("t") ylabel("u") axis([0,5,-2,2]) grid()

Let’s see the spectrum of the discrete signal:
tfd = numpy.fft.fft(x2) f = numpy.arange(0,N2)*1.0/T figure(figsize=(10,5)) stem(f,numpy.absolute(tfd)/N2) xlabel('f') ylabel('A') grid()

We see on this spectrum that the Nyquist-Shannon condition is well respected: the spectrum of the continuous signal, made up of the three lines of frequencies 1, 3 and 5, is indeed obtained on the first half. In other words, the spectrum of the signal and its image do not overlap. As with the sinusoid, it is possible to completely reconstruct the signal from the samples.
2.d. Tape folding
Band aliasing occurs when the Nyquist-Shannon condition is not met. Here is an example of an undersampled sine wave:
def u(t): return math.sin(2*math.pi*t) T=40.0 fe1 = 100.0 te1 = 1.0/fe1 N1 = int(T*fe1) t1 = numpy.arange(0,N1)*te1 x1 = numpy.zeros(N1) for k in range(N1): x1[k] = u(t1[k]) fe2 = 1.51 te2 = 1.0/fe2 N2 = int(T*fe2) t2 = numpy.arange(0,N2)*te2 x2 = numpy.zeros(N2) for k in range(N2): x2[k] = u(t2[k]) figure(figsize=(12,4)) plot(t1,x1,"b") stem(t2,x2,"r") xlabel("t") ylabel("u") grid()

and the spectrum of the discrete signal:
tfd = numpy.fft.fft(x2) f = numpy.arange(0,N2)*1.0/T figure(figsize=(10,5)) stem(f,numpy.absolute(tfd)/N2) xlabel('f') ylabel('A') grid()

The spectrum obtained is always symmetrical with respect to the Nyquist frequency, but the left part does not correspond at all to the spectrum of the continuous signal, since the maximum is found at 0.5 instead of 1. This spectrum in fact comprises a line at the frequency f = 1 of the signal and another at the frequency fe-f. If one seeks to reconstitute the continuous signal starting from these samples, one obtains a sinusoid of frequency fe-f = 0.51, of lower frequency than the initial sinusoid. The appearance of low spurious frequencies is a consequence of downsampling which can be very troublesome. Not only is there a loss of information, but information not present in the original continuous signal appears.
The spectrum obtained can be interpreted by noting that the spectrum of a sampled sinusoid always comprises two lines of frequencies f and fe-f. When the Nyquist-Shannon condition is met f <fe-f. When it is not fe-f <f. We say that the image of the spectrum (the line fe-f) is folded in the frequency band [0, fe / 2], which is why we speak of band aliasing.
2.e. Anti-aliasing filtering
The above shows that downsampling should be absolutely avoided. In the case of analog-to-digital conversion, for example when digitizing sound, the maximum frequency fmax of the signal can be quite large, while the sampling frequency fe is limited by the working rate of the electronic circuit of digitization.
If the maximum practicable sampling frequency is less than 2fmax, one solution consists in carrying out an analog low-pass filtering of the signal before its digitization, so as to remove from its spectrum the frequencies higher than fe / 2. This type of filter is called an anti-aliasing filter. Ideally, an anti-aliasing filter should have a gain of 1 in the passband [0, fe / 2], zero outside. The following figure shows the block diagram of the digitization device comprising the anti-aliasing filter and the analog-to-digital converter:

In reality, the anti-aliasing filter is difficult to achieve. We therefore prefer, when possible, to increase the sampling frequency.
Take the example of the digitization of sound. The human ear perceives sounds up to 20 kHz. To digitize sound for high-fidelity reproduction, it is therefore necessary to use a frequency of at least 40 kHz. Sounds with a frequency above 20 kHz are inaudible but they can be found in the audible band by the phenomenon of band aliasing. It is necessary to apply a low-pass filtering which removes the frequencies above 20 kHz. Consider for example a first order low pass filter with cutoff frequency fc = 20 kHz. The gain has the following form: G (f) = 11 + ffc2 (6)
This filter has a slope of -20 decibel per decade in the attenuated band, which is not sufficient to remove frequencies located just above 20 kHz, for example 25 kHz. If you lower the cut-off frequency, you risk introducing sound distortion. It is therefore necessary to use a much more selective filter, more difficult to achieve, especially if it is necessary to minimize the distortion in the passband.
The solution adopted today for the digitization of sound is that of over-sampling. For example, with a sample rate of 176 kHz, the filter should have a gain of 1 in the [0.20 kHz] band, but need not be very selective. It will suffice that its gain at 96 kHz is low enough to eliminate the frequencies beyond. In fact, the anti-aliasing filter is not even necessary anymore because the microphones naturally perform this filtering. The sound is recorded at a frequency of 44 kHz (for example on audio CD). The change from 192 kHz to 44 kHz is done after applying a very selective digital anti-aliasing filter, much easier to achieve than an analog filter. Here is the block diagram of the complete device:

3. Reconstruction
3.a. Analog smoothing filter
Signal reconstruction takes place during digital-to-analog conversion, for example in an audio CD player. The objective is to reconstruct a continuous signal (analog) as close as possible to the signal whose spectrum is that of the band [0, fe / 2]. Let’s see this on the example of a sinusoid of period 1, which we sample at a frequency greater than 2.
def u(t): return math.sin(2*math.pi*t) T=40.0 fe2 = 8.56 te2 = 1.0/fe2 N2 = int(T*fe2) t2 = numpy.arange(0,N2)*te2 x2 = numpy.zeros(N2) for k in range(N2): x2[k] = u(t2[k]) figure(figsize=(12,4)) stem(t2,x2,"r") xlabel("t") ylabel("u") axis([0,10,-1,1]) grid()

tfd = numpy.fft.fft(x2) f = numpy.arange(0,N2)*1.0/T figure(figsize=(10,5)) stem(f,numpy.absolute(tfd)/N2) xlabel('f') ylabel('A') grid()

The output of a digital-to-analog converter is not made up of points like the discrete signal but of steps. Indeed a circuit called sampler-and-hold maintains the constant output voltage between two samples. We can simulate the effect of the sample and hold. To do this, you have to increase the sampling frequency by a factor of n:
def echantBloqueur(x,fe,n): N = x.size y = numpy.zeros(0) t = numpy.arange(0,N*n)*1.0/(fe*n) for k in range(N): y = numpy.concatenate((y,numpy.ones(n)*x[k])) return (t,y) n=10 N3=N2*n fe3 = fe2*n (t,x3) = echantBloqueur(x2,fe2,n) figure(figsize=(12,4)) plot(t,x3) xlabel('t') axis([0,10,-1,1]) grid()

To reconstruct the original sine wave from this signal, a smoothing filter must be used. This is an analog filter placed after the digital-to-analog converter. From a frequency point of view, the function of this filter is to remove the frequencies of the band [fe / 2, fe], that is to say the frequencies of the image of the spectrum of the analog signal. For this reason, the smoothing filter is also called the anti-image filter. Strictly speaking, it would be necessary to take into account the modification of the spectrum brought by the sample-and-hold ([2]), which we will not do here.

Ideally, the smoothing filter is a low-pass filter whose gain is 1 in the [0, fmax] band (with a phase varying linearly with the frequency), 0 in the [fe / 2, fe] band. If the frequency fmax is close to the Nyquist frequency (half of fe), the smoothing filter is very difficult to achieve (like the anti-aliasing filter). On the contrary, if fmax is low compared to the Nyquist frequency (oversampling), the smoothing filter is very easy to achieve (a simple RC filter is sufficient).
We can simulate the effect of the smoothing filter with a digital FIR filter. For more on this, refer to the document Examples of FIR filters. The cutoff frequency is chosen equal to half the sampling frequency (before increasing the factor n). The (infinite) impulse response of the ideal low pass filter is: gk = 2a sinc (k2a) (7)
where a = fc / fe = 0.5 and the cardinal sine function has been defined above (5). We therefore have: gk = sinc (k) (8)
We obtain precisely the cardinal sine which appears in Shannon’s formula (4). In practice, it is necessary to truncate the impulse response at rank P to make it finite. To filter the signal, you must also divide the cutoff frequency by n. To generate the finite impulse response, we use the scipy.signal.firwin function, with Hann windowing to reduce ripples in the passband:
import scipy.signal P = 10 h = scipy.signal.firwin(numtaps=2*P+1,cutoff=[0.5/n],nyq=0.5,window='hann')
P is the truncation index of the impulse response, which must be increased to make the filter more selective.
Here is the result of the filtering:
y3 = scipy.signal.convolve(x3,h,mode='valid') t3 = numpy.arange(0,y3.size)*1.0/fe3 figure(figsize=(12,4)) plot(t3,y3) xlabel('t') axis([0,10,-1,1]) grid()

3.b. Digital interpolation filter
The previous technique, consisting of using an analog smoothing filter to reconstruct the signal, is difficult to implement, especially when the Nyquist frequency is just greater than fmax. As with the anti-aliasing filter, we come up against the difficulty of producing a very selective analog filter without distortion in the passband.
Another solution is to increase the sampling frequency so as to perform a digital smoothing, before the digital-to-analog conversion. The analog smoothing filter is then much simpler to produce because the Nyquist frequency is higher. The digital smoothing filter is called an interpolation filter; this is a low-pass filter whose cutoff frequency is half the sample rate before multiplication. The realization of a very selective digital low-pass filter does not pose any difficulty. This is exactly what we did in the previous example, where the sample rate was increased by a factor of 10 before applying digital low pass filtering.
The interpolation filter thus performs the convolution expressed by Shannon’s formula (4), convolution between the samples and a cardinal sine. In practice, the reconstruction is imperfect because the cardinal sine must be truncated to obtain a finite impulse response.
This technique is used in audio CD players, where the base frequency of 44 kHz is increased by a factor of 4 before applying the digital interpolation filter (22 kHz low pass). Since the new DAC frequency is 176kHz, the Nyquist frequency (88kHz) is much larger than fmax = 20kHz, which allows a simple first order filter to be used for analog smoothing. The following figure shows the block diagram of the complete chain:
