## 1. Introduction

Sampling is a common operation not only in analog-to-digital conversion, but also in any digital calculation consisting in generating discrete values from a continuous function (sampling of functions, synthesis of images, etc.).

We will see the need for a sufficient sampling frequency to avoid the phenomenon of aliasing. The principle of stochastic sampling is also presented.

import math import numpy from matplotlib.pyplot import *

## 2. periodically sampling

Let be a function with a real variable u (t). The variable t represents time or any other variable, for example a space variable. Periodic sampling consists of fixing a sampling period Te and obtaining a sequence of numbers defined by: uk = u (kTe) (1)

where the index k is integer. The sampling frequency is fe = 1 / Te.

In some cases (real-time electronic systems), the index k is unbounded. In other cases, a finite number N of samples is taken and the index k varies from 0 to N-1. We will consider this case. The interval over which the function is sampled is [0, T].

The function that will serve as an example is as follows:

a = 20000 def u(t): return math.cos(2*math.pi*a*(1-t*t))

that we sample over the interval [0,0.1]:

T=0.1 N=2000 Te=T/N fe=1.0/Te t = numpy.zeros(N) echant = numpy.zeros(N) for k in range(N): t[k] = k*Te echant[k] = u(t[k])

To represent the samples graphically, we can connect the points with line segments, which we usually do when plotting a function:

figure(figsize=(12,4)) plot(t,echant) xlabel('t') ylabel('u') grid()

Let’s also see a detail of the area on the right:

figure(figsize=(12,4)) plot(t,echant) grid() axis([0.08,0.1,-1,1])

This function provides both slow variations and rapid changes. By changing the parameter, one can increase or reduce the presence of rapid changes. The discrete Fourier transform provides the spectrum of these samples. The resolution of the spectrum is the inverse of the time T.

from numpy.fft import fft tfd = fft(echant) spectre = numpy.absolute(tfd) freq = numpy.zeros(N) for k in range(N): freq[k] = k*1.0/T figure(figsize=(12,4)) plot(freq,spectre) axis([0,fe,spectre.min(),spectre.max()]) xlabel('f') ylabel('A') grid()

The spectrum extends from the zero frequency to the sampling frequency fe. Half of the sample rate is called the Nyquist frequency, fn = fe / 2. It is noted that the spectrum is symmetrical with respect to this frequency. The right part is the image of the spectrum of the continuous function.

According to Shannon’s theorem, samples retain the information present in the original signal if the sampling frequency is greater than twice the largest frequency present in the signal spectrum. Denote by fmax this greater frequency. In the present case, we can estimate it at around 6000. The Nyquist-Shannon condition is written fn> fmax. In this example, this condition seems to be true, which means that the sampling frequency is high enough. One notes nevertheless on the plotting of the samples in the interval [0.08,0.1] that the shape of the oscillations is not restored perfectly (although the period is correct).

We speak of sub-sampling when the Nyquist-Shannon criterion is not satisfied. We are going to place ourselves in this case by reducing the sampling frequency:

N=300 Te=T/N fe=1.0/Te t = numpy.zeros(N) echant = numpy.zeros(N) for k in range(N): t[k] = k*Te echant[k] = u(t[k]) figure(figsize=(12,4)) plot(t,echant) xlabel('t') ylabel('u') grid()

The original function is not output correctly in high frequency areas. We also notice the presence of a low-frequency variation (a little before 0.08) which is not at all present in the original signal. The most annoying consequence of downsampling is the appearance of low frequencies in the sampled signal. The English word for this phenomenon is aliasing (from the Latin alias). In some cases like this, a replica of the low-frequency signal is indeed observed.

Let’s see the spectrum of samples:

tfd = fft(echant) spectre = numpy.absolute(tfd) freq = numpy.zeros(N) for k in range(N): freq[k] = k*1.0/T figure(figsize=(12,4)) plot(freq,spectre) axis([0,fe,spectre.min(),spectre.max()]) xlabel('f') ylabel('A') grid()

The two symmetrical parts of the spectrum have mingled. This is obviously a consequence of not respecting the condition fn> fmax. The result is a completely altered spectrum. Some parts of the image spectrum have folded back onto the spectrum. This phenomenon is called band aliasing. It occurs in the event of downsampling. Band folding and aliasing are two manifestations (one frequency and the other temporal) of the same phenomenon.

To avoid band aliasing, the sampling frequency must be increased to a sufficient level to meet the condition fe> 2fmax. In the field of the digitization of time signals (for example the digitization of sound), this condition is systematically fulfilled. In this case, a very significant oversampling is even carried out at the time of digitization, because today’s electronics allow it without any problem. In other areas, such as digitizing images, over-sampling is not always possible. For example, a digital camera sensor has a sampling rate set by the number of pixels. It is then necessary to carry out an analog low-pass filtering before carrying out the digitization (anti-aliasing filter).

In the purely digital domain (sampling of functions, image synthesis), anti-aliasing filtering is not possible. We will see other sampling techniques that can reduce the effects of aliasing.

## 3. Stochastic sampling

Stochastic sampling, or Monte-Carlo sampling, can be used in the field of signal synthesis, for example ray tracing image synthesis. It consists of choosing the points to be sampled more or less randomly.

We can start by choosing the N points to sample randomly over the interval [0, T], with a uniform probability density. It is of course necessary to order the samples by increasing t before plotting them.

from random import random N=300 table = numpy.zeros((N,2)) for k in range(N): table[k][0] = random()*T table[k][1] = u(table[k][0]) def key(elem): return elem[0] table = sorted(table,key=key) table=numpy.transpose(table) t = table[0] echant = table[1] figure(figsize=(12,4)) plot(t,echant) xlabel('t') ylabel('u') grid()

The phenomenon of aliasing has disappeared. However, this method is unsatisfactory because the samples may concentrate in certain places or, on the contrary, be lacking in other places. To remedy this, we carry out stratified stochastic sampling. It is a matter of dividing the interval [0, T] into N equal parts (as for periodic sampling), and of placing the instant t randomly in each part. However, we keep a periodic distribution of times in the final table.

N=300 t = numpy.zeros(N) echant = numpy.zeros(N) Te=T/N for k in range(N): t[k] = k*Te echant[k] = u(t[k]+Te*0.5*(random())) figure(figsize=(12,4)) plot(t,echant) xlabel('t') ylabel('u') grid()

This method makes it possible to correctly restore the frequency bases of the signal, while greatly reducing the phenomenon of aliasing. Of course, it is better, if possible, to increase the sampling frequency.

A variant, used in image synthesis, consists in choosing several samples randomly over each interval of length Te, and in retaining the arithmetic mean of the samples.

N=300 t = numpy.zeros(N) echant = numpy.zeros(N) Te=T/N ne = 100 for k in range(N): t[k] = k*Te moy = 0.0 for i in range(ne): moy += u(t[k]+Te*0.5*(random())) moy /= ne echant[k] = moy figure(figsize=(12,4)) plot(t,echant) xlabel('t') ylabel('u') grid()

This averaging brings back the phenomenon of aliasing. Rather, this method is used to reduce the effect of aliasing near sharp edges in an image.

## 4. Oversampling

The best method, when possible, is over-sampling. After very high frequency sampling has been completed, it may be necessary to reduce the number of samples to limit the amount of data to be stored. For example in the case of an image, the final resolution of the image may be much lower than the sampling resolution.

It is therefore necessary to establish a method to reduce the number of samples, without introducing again the phenomenon of aliasing. This can be achieved by first applying low-pass filtering to the samples before reducing their number.

The simplest low pass filter is the averaging filter. Suppose that we sample 10N points and that we want to keep N points. For each group of 10 consecutive points, we keep the average value.

N=300 Te=T/N n = 10 dT = Te/n t=numpy.zeros(N) echant=numpy.zeros(N) for k in range(N): t[k] = k*Te moy = 0.0 for i in range(n): moy += u(t[k]+i*dT) moy /= n echant[k] = moy figure(figsize=(12,4)) plot(t,echant) xlabel('t') ylabel('u') grid()

The result is a signal in which the high frequencies have been attenuated (by filtering). Aliasing is present but it is considerably reduced. The averaging filter is not the best low pass filter. It is best to use a Gaussian low-pass filter, as explained in Introduction to digital filters.