Sunday, May 26, 2024
Derniers articles
Accueil > Unclassified > Introduction to spectral analysis
Unclassified

# Introduction to spectral analysis

## 1. Introduction

This document is an introduction to the spectral analysis of periodic signals. After having explained the decomposition of a periodic signal into a sum of sinusoidal functions, we will see how to perform the spectral analysis of a sampled signal.

## 2. Fourier series and spectrum of a periodic signal

We consider a periodic signal, represented by a function u of a real variable t with real values, of period T and of class C1 piecewise.

The fundamental frequency of the signal is: f1 = 1T (1)

According to the Fourier theorem, this function can be written as a sum of sinusoids whose frequencies are multiples of the fundamental frequency. The sum obtained is the Fourier series: u (t) = A02 + ∑n = 1PAncosn2πTt + φn (2)

In some cases, the sum can be stopped at a finite rank P. In other cases, we must in principle consider the limit P → ∞.

The term of order n is called the harmonic of order n of the signal: it is a sinusoid of frequency fn = nf1 = nT (3)

The harmonic of order n is defined by its amplitude An (positive) and its phase shift ϕn.

The constant term A0 / 2, which can be seen as the zero frequency term, is the mean value of the signal: A02 = 1T∫0Tu (t) dt (4)

Let us consider as an example a function whose Fourier series stops at rank P = 3. In this case, we say that the signal has three harmonics. By convention, the period is taken equal to 1:

```import numpy
import math

f1=1.0
def u(t):
return 0.4+1.0*numpy.cos(2*numpy.pi*f1*t)\
+0.5*numpy.cos(2*2*numpy.pi*f1*t-numpy.pi/3)\
+0.2*numpy.cos(3*2*numpy.pi*f1*t+numpy.pi/4)
```

To plot this signal, it is necessary to sample it, that is to say calculate the values of u (t) for instants regularly distributed over an interval and place them in an array. Here is a sample over two periods with 500 points:

```import numpy
N = 500
Tmax = 2.0
Te = Tmax/N
t = numpy.arange(N)*Te
x = u(t)
```

The curve of the signal is obtained with the plot function, which connects the points by segments of straight lines. If the sampling period Te is small compared to the period of the last harmonic (here that of order 3), a good graphic representation of the signal is thus obtained.

```from matplotlib.pyplot import *
figure(figsize=(10,4))
plot(t,x)
xlabel('t')
ylabel('u')
axis([0,2,-2,2])
grid()
```

The signal spectrum is the graphical representation of the amplitude An as a function of the frequency. A periodic signal theoretically has a discrete spectrum formed by lines, each corresponding to a harmonic. For this example, there are 4 lines: one for the average value (zero frequency) and 3 lines for the harmonics of order 1, 2 and 3:

```figure(figsize=(8,4))
stem([0,1,2,3],[0.8,1.0,0.5,0.2],'r')
xlabel('f')
ylabel('A')
axis([-1,5,0,2])
grid()
```

The zero frequency line has an amplitude equal to twice the mean value of the signal (the so-called continuous component).

The first curve (u as a function of t) is the temporal representation of the signal. The spectrum is the frequency representation of the signal. In principle, it would also be necessary to trace the phase ϕn to have a complete representation.

For calculations, it is convenient to introduce the Fourier series in complex form: u (t) = ∑n = -PPcnexpin2πTt (5)

The coefficient cn is a complex number called the Fourier coefficient. It is calculated from the function u with the following integral: cn = 1T∫0Tu (t) exp-in2πTt (6)

Since u (t) is real, the Fourier coefficient satisfies the following property: cn = c-n * (7)

where the star designates the conjugate complex. The sum is therefore written: u (t) = c0 + ∑n = 1Pcnexpin2πTt + c-nexp-in2πTt (8) = c0 + ∑n = 1P2Recnexpin2πTt (9)

By identification, we deduce: c0 = A02 (10) cn = An2eiφn (n> 0) (11)

## 3. Discrete Fourier transform

### 3.a. Definition

The discrete Fourier transform (DFT) is the transformation which makes it possible to calculate the spectrum of a discrete signal, obtained by sampling a continuous signal. Consider a sampling of the function u over the interval [0, T], comprising N points and defined by: tk = kTN (0≤k≤N-1) (12) uk = u (tk) (13)

We define the sampling period and the sampling frequency by: Te = TN (14) fe = 1Te (15)

An approximate value of the integral (6) defining the Fourier coefficients can be obtained by the method of rectangles: cn≃1T∑k = 0N-1TNukexp-i2πnkN (16)

By definition, the discrete Fourier transform ([1]) is the map which to the N numbers uk associates the following N complex numbers (0≤n≤N-1): Un˜ = 1N∑k = 0N-1ukexp-i2πnkN (17)

The DFT verifies the following property (if uk is real): U˜n = U˜N-n * (18)

The discrete Fourier transform is calculated numerically with the so-called fast Fourier transform algorithm. The function that performs this calculation (on an oscilloscope or in software) is often referred to as FFT (Fast Fourier Transform).

### 3.b. Example: trigonometric polynomial

If u is a trigonometric polynomial, there exists a finite rank P such that ∀n> P we have cn = 0. In other words, the Fourier series stops at a finite rank. There is therefore in the spectrum of the signal a maximum frequency: fmax = PT (19)

It is assumed that the Nyquist-Shannon condition is verified: the sampling frequency is greater than twice the maximum frequency of the spectrumfe> 2fmax (20)

or again: N> 2P (21)

If this condition is verified, then the first P + 1 values of the DFT are exactly the Fourier coefficients: cn = U˜npour0≤n≤P (22)

This property is surprising since the DFT is initially defined to give approximate values of the Fourier coefficients. Its demonstration is given below (paragraph 3.d).

To see how DFT can obtain the signal spectrum, let’s take the previous example. Since the period is equal to 1, the index n corresponds exactly to the frequency. The greatest frequency of the spectrum is P = 3 (harmonic of order 3). To meet the condition, it is therefore necessary to sample at least 6 points per period. Let’s see the result for N = 10. The period of the signal being known a priori, we can sample exactly 10 points over a period:

```N = 10
Tmax = 1.0
Te = Tmax/N
t = numpy.arange(0,N)*Te
x=u(t)
figure(figsize=(10,4))
stem(t,x)
xlabel('t')
ylabel('u')
grid()
```

Here is the calculation of the discrete Fourier transform:

```import numpy.fft
tfd = numpy.fft.fft(x)
```
```print(tfd)
--> array([  4.00000000e+00 +0.00000000e+00j,
5.00000000e+00 -2.10942375e-15j,
1.25000000e+00 -2.16506351e+00j,
7.07106781e-01 +7.07106781e-01j,
7.49400542e-16 +4.44089210e-16j,
1.77635684e-15 +2.22044605e-15j,
7.49400542e-16 -4.44089210e-16j,
7.07106781e-01 -7.07106781e-01j,
1.25000000e+00 +2.16506351e+00j,   5.00000000e+00 -2.60902411e-15j])```

The modulus is plotted as a function of the frequency. We must divide by N because the function numpy.fft.fft calculates the TFD without the factor 1 / N. Here the frequency is equal to the index n:

```figure(figsize=(10,5))
f=numpy.arange(0,N,1)
stem(f,numpy.absolute(tfd)/N)
xlabel("f")
ylabel("|cn|")
axis([-1,11,0,1])
grid()
```

The first 4 values ​​of the DFT indeed give the Fourier coefficients of rank 0 to 3. The elements of index 4,5 and 6 are zero, except for rounding errors (errors of the order of 1e-16). The last three elements (indices 7 to 9) are the conjugates of the three coefficients c1, c2 and c3; they therefore have the same module. The term index 1 is combined with the term index 9, that of index 2 combined with the term index 8, etc. The center frequency of the spectrum is equal to half of the sampling frequency; this is the Nyquist frequency. Here, the Nyquist frequency is fn = 5. As the Nyquist-Shannon condition is met, this frequency is greater than P = 3. The part of the spectrum to the left of this frequency is the spectrum of the continuous signal u (t). The entire spectrum is that of the discrete signal. The part to the right of the Nyquist frequency is the image of the left part (excluding the zero frequency component). We see that the Nyquist-Shannon condition results in the non-overlapping of the spectrum and its image.

In fact, the spectrum of the discrete signal is periodic, with a period equal to the sampling frequency. The relation defining the TFD indeed verifies the relation: U˜N + n = Un˜ (23)

The DFT that we calculated therefore gives the values of this spectrum over a period. However, what interests us in practice is rather the spectrum of the continuous signal u (t), which is given by the first half of the DFT.

In practice, we are interested in the amplitudes An of the harmonics rather than in the values of the Fourier coefficients cn. The coefficient must then be multiplied by two:

```figure(figsize=(10,5))
f=numpy.arange(0,N,1)
stem(f,numpy.absolute(tfd)*2/N)
xlabel("f")
ylabel("|cn|")
axis([-1,11,0,1.2])
grid()
```

Note that the zero frequency component of the spectrum has an amplitude equal to twice the mean value. In the present case, one could easily divide the term of index 0 by two to directly obtain the component of zero frequency of the signal, but this method is not always applicable on the experimental spectra.

### 3.c. Inverse discrete Fourier transform

The information present in the sampled signal is entirely contained in its DFT. We can indeed calculate the signal from its DFT by the following relation: uk = ∑n = 0N-1U˜nexpi2πnkN (24)

The transformation which thus makes it possible to find the discrete signal is the inverse discrete Fourier transformation. It looks a lot like direct DFT: we notice the change of sign in the exponential and the absence of the factor 1 / N. With regard to this factor, there are also different conventions for the definition of TFD.

As an example, let’s calculate the inverse DFT of the previously obtained DFT:

```y = numpy.fft.ifft(tfd)
figure(figsize=(10,4))
stem(t,y)
xlabel('t')
ylabel('y')
grid()
```

We find the discreet starting signal.

It is interesting to modify the TFD before calculating the inverse TFD. We can for example remove the third harmonic. For that, we must cancel the coefficient of index 3 and its image, of index N-3 = 7:

```i=3
tfd[i] = 0
tfd[N-i] = 0
y = numpy.fft.ifft(tfd)
figure(figsize=(10,4))
stem(t,y)
xlabel('t')
ylabel('y')
grid()
```

The result is the discrete signal which would be obtained by sampling the signal u (t) from which the third harmonic would have been removed. We therefore have here a very selective filtering method, called filtering in the frequency domain, or filtering by transform. by Fourier. It can be used when you need to filter a discrete signal whose size N is fixed in advance (and not too large). However, this method is more difficult to implement for real-time filtering, for which there is a continuous flow of data to be processed. In this case, a filtering is carried out in the time domain, also called filtering by convolution. For more on this, see the document Introduction to digital filters.

### 3.d. Relationship between DFT and Fourier coefficients

This paragraph gives the proof of the equality between the Fourier coefficients and the DFT.

We consider the Fourier series for the moment tk = kT / N: uk = ∑m = -∞ + ∞cmexpim2πTkTN = ∑m = -∞ + ∞cmexpi2πmkN (25)

We place ourselves for the moment in the general theoretical case, where the sum must be extended to infinity. Consider the integer division of the index m by N: m = qN + n (26)

q is the whole result of this division and n is the remainder. The sum over the index m can be written as a double sum, over n and over q. By grouping all the terms which have the same remainder, we obtain as follows: uk = ∑n = 0N-1∑q = -∞ + ∞cqN + nexpi2πnkN (27)

By identifying this expression with that of the inverse DFT, we show that: U˜n = ∑q = -∞∞cqN + n (28)

For the Fourier coefficient of rank n, the error of the approximation by U˜n is therefore: U˜n-cn = ∑q ≠ 0cqN + n (29)

We now consider the case where there exists a rank P such that for | k |> P we have ck = 0. We further assume that N> 2P (Nyquist-Shannon condition).

Let n be an integer such that 0≤n≤P. For q> 0, we have qN + n> 2qP + n> 2P + n> P therefore cqN + n = 0. Likewise for q <0 we have qN + n <2qP + n <-2P + n <-P therefore cqN + n = 0. All the terms of the sum are therefore zero: U˜n-cn = 0for0≤n≤P (30)

This demonstrates the equality between the first P + 1 terms of the DFT and the Fourier coefficients, provided that the Nyquist-Shannon condition is respected.

## 4. Spectral analysis of real signals

### 4.a. Analysis window

A real periodic signal has a period which, most often, is not known a priori. The objective of spectral analysis is precisely to determine the frequencies it contains. Moreover, its periodicity can be imperfect, even very rough. It can therefore be seen that the previous approach, which consisted in sampling the signal over its period T, is not applicable in practice.

Instead of trying to sample over a period, we consider a duration T greater than the assumed period. For example, for a sound of frequency 400 Hz, we can take T = 1s. We thus obtain a spectrum whose lines are the Fourier coefficients of a function of period T. The frequencies of these lines are therefore multiples of 1 / T. The duration T is the width of the analysis window. The frequency resolution of the spectrum obtained is 1 / T.

### 4.b. Example: sampled function

We take the signal already used by modifying its frequency:

```f1 = 1.0324
```

We choose an analysis time T = 20, which will give a frequency resolution of 0.05. The sampling frequency must respect the Nyquist-Shannon condition, i.e. fe> 7.

```T=20.0
fe=10.0
N=int(T*fe)
t = numpy.arange(0,N)*Te
x=u(t)
tfd=numpy.fft.fft(x)
f=numpy.arange(0,N)*1.0/T
figure(figsize=(10,4))
vlines(f,[0],numpy.absolute(tfd)/N,'b')
xlabel('f')
ylabel('C')
axis([-1,10,0,0.5])
grid()
```

The spectrum obtained is composed of very tight lines, spaced 1 / T apart. The spectrum of the function appears as the curve drawn by these lines. The proof of this property is outside the scope of this introduction, because it calls on the notion of Fourier transform.

We see in this example that the spectrum obtained is imperfect: the position of the maximum of the lines is best defined with a precision of 1 / T. In the present case, this induces a clearly visible error on the height of the lines. To improve accuracy, the duration of the analysis window must be increased. Let’s see the result with T = 10000:

```T=10000.0
fe=10.0
N=int(T*fe)
t = numpy.arange(0,N)*Te
x=u(t)
tfd=numpy.fft.fft(x)
f=numpy.arange(0,N)*1.0/T
figure(figsize=(10,4))
plot(f,numpy.absolute(tfd)/N,'b')
xlabel('f')
ylabel('C')
axis([-1,10,0,0.5])
grid()
```

The frequency resolution is much better. The expected amplitudes of the harmonics (according to the definition of the signal) are correct: c0 = 0.4, c1 = 0.5, c2 = 0.25, c3 = 0.1. Note: the plot function was used here and not vlines because the latter is not optimized for processing a large volume of data.

In certain cases, increasing the duration of the analysis window is not possible, or else is not sufficient to precisely obtain the height of the lines. We can then use the technique of filling by zeros associated with a progressive windowing, explained in Spectral analysis of a digital signal.

### 4.c. Example: sound analysis

Let us now see a more realistic example, the analysis of the sound emitted by a musical instrument. This is the sound emitted by a clarinet (A3 note) for a duration of about 3 seconds. Obviously, the width of the analysis window is limited by this duration.

The sound is recorded in a WAV file. We start by reading the file and extract the sampling frequency, the number of samples and the duration of the sound:

```import scipy.io.wavfile as wave
N = data.size
T = N*1.0/fe
data =data/data.max()
```
```print(fe)
--> 44100```
```print(T)
--> 2.8908843537414968```

To obtain a temporal representation of the signal (the waveform), it is necessary to limit oneself to a window comprising about ten oscillations, i.e. here about 0.02s:

```te = 1.0/fe
T = 0.02
N = int(T/te)
t = numpy.arange(N)*te
x = data[0:N]
figure(figsize=(10,4))
plot(t,x)
xlabel("t")
ylabel("A")
grid()
```

The waveform lets guess a spectrum very rich in harmonics.

To perform the spectral analysis, we choose a window of width T = 1Hz, which will give a precision of 1Hz on the frequency:

```T=1.0
N=int(T/te)
x = data[0:N]
tfd = numpy.fft.fft(x)
f = numpy.arange(N)*1.0/T
a = numpy.absolute(tfd)*2/N
figure(figsize=(10,4))
plot(f,a)
xlabel("f (Hz)")
ylabel("A")
grid()
```

We are interested in the spectrum of sound as a continuous signal, that is to say in the first half. Let’s see the details in the [0.5 kHz] band:

```figure(figsize=(10,4))
plot(f,a)
xlabel("f (Hz)")
ylabel("A")
grid()
axis([0,5000,0,a.max()])
```

This spectrum shows the fundamental (at 440Hz) and harmonics of order 2 to 8. The harmonics of high order (for example 7) have a very high frequency (for the ear). Although their amplitude is relatively small, they contribute to the timbre of the sound.