Welcome to Intelligent Additive Manufacturing Lab

We're aiming to create a knowledge hub for 3D printing of the future.

On this page, a MATLAB program based on the Gabor transform is used to analyze two rock and roll songs. We use Gabor filtering to find which individual notes are played in each song.

Introduction

The Fourier transform is a great method for determining which frequencies are present in a signal. However, it cannot tell us when those frequencies occur in time. The Gabor transform was proposed, building on the Fourier transform, to extract this local information from the signal. A good application of the Gabor transform is therefore identifying an instrument in a song. On this page, we use the Gabor transform to reproduce the music score for the guitar in Sweet Child O’ Mine and the bass in Comfortably Numb. We also isolate the bass in Comfortably Numb and then try to isolate the guitar solo in the same song.
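The limitation of the plain Fourier transform can be seen in a small sketch. The signal, sample rate, and tone frequencies here are made up for illustration and are not taken from the songs. Two signals play the same two tones in opposite order, yet their FFT magnitudes agree to within rounding error.

```matlab
Fs   = 1000;                 % sample rate in Hz (assumed for this sketch)
t    = (0:Fs-1)/Fs;          % one second of samples
half = Fs/2;                 % samples per tone

% Signal a: 50 Hz tone followed by 120 Hz tone
a = [sin(2*pi*50*t(1:half)), sin(2*pi*120*t(1:half))];
% Signal b: the same two tones in the opposite order (circular shift by half)
b = circshift(a, [0 half]);

% A circular time shift changes only the phase of the FFT, not its magnitude
diffMag = max(abs(abs(fft(a)) - abs(fft(b))));
fprintf('Largest difference between magnitude spectra: %g\n', diffMag);
```

Because the magnitude spectra match, the ordering of the tones has to be recovered some other way, which is what the windowed transform on this page is for.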

The musical scale, with the frequency of each note in Hz, is shown in Figure 1.

Figure 1: Music scale along with the frequency of each note in Hz
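For reading notes off a spectrogram, the note frequencies can also be computed rather than looked up. The sketch below assumes twelve-tone equal temperament with A4 tuned to 440 Hz and uses MIDI note numbers (A4 = 69); whether Figure 1 uses exactly this tuning is an assumption. The helper names noteFreq and nearestMidi are hypothetical.

```matlab
% Frequency in Hz of MIDI note number m (equal temperament, A4 = 440 Hz assumed)
noteFreq = @(m) 440 * 2.^((m - 69)/12);

% Nearest MIDI note number for a measured frequency f in Hz
nearestMidi = @(f) round(69 + 12*log2(f/440));

names = {'C4','D4','E4','F4','G4','A4','B4'};
midi  = [60 62 64 65 67 69 71];
f     = noteFreq(midi);

for i = 1:numel(midi)
    fprintf('%s: %.2f Hz\n', names{i}, f(i));
end
```

A peak measured at, say, 330 Hz would map to nearestMidi(330), which can then be matched against the scale.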

Theoretical Background

The foundation for this time-frequency analysis is the Fourier Transform (FT). On the interval (-L, L], a function can be represented as a sum of complex exponentials, so the coefficients tell us which frequencies are present in the song. It is defined as follows:

(1)   \begin{equation*} f\left( x\right) =\sum_{n=-\infty}^{\infty }  {C_n e^{\frac{in\pi }{L} x}} \qquad x\in \left( -L,L\right]\end{equation*}
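For completeness, the coefficients in this expansion follow from the orthogonality of the complex exponentials on the interval. This is the standard result, written here in the same notation:

```latex
\begin{equation*} C_n = \frac{1}{2L} \int_{-L}^{L} f\left( x\right) e^{-\frac{in\pi }{L} x}\, dx \end{equation*}
```

The magnitude of each C_n measures how strongly the frequency n/(2L) is present in the signal.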

However, to know when those frequencies occur, we need the Gabor Transform, also known as the Short-Time Fourier Transform (STFT), to filter our songs. It can be written as:

(2)   \begin{equation*} \widetilde{f_{g}} \left( t,\omega \right) =\int^{\infty }_{-\infty } f\left( \tau \right) g\left( t-\tau \right) e^{-i\omega \tau }\, d\tau \end{equation*}

Here g(t-\tau) is the sliding time filter. Multiplying f(\tau) by the filter localizes the signal to a specific window in time, centered at \tau=t. By sliding the filter along the time axis, we can determine the sequence of notes in our songs.
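A minimal sketch of this sliding-window idea on a synthetic signal follows. The test signal, sample rate, and window centers are assumptions chosen for illustration; the Gaussian window has the same form as the one in the code at the end of this page.

```matlab
Fs = 1000; n = 1000;             % assumed sample rate and length (one second)
t  = (0:n-1)/Fs;
% 50 Hz during the first half second, 120 Hz during the second half
s  = sin(2*pi*50*t) .* (t < 0.5) + sin(2*pi*120*t) .* (t >= 0.5);

tau     = 50;                    % window parameter (larger tau = narrower window)
fAxis   = (0:n/2-1) * Fs/n;      % nonnegative frequencies in Hz
centers = [0.25 0.75];           % two window centers, one inside each tone
peakHz  = zeros(size(centers));

for j = 1:numel(centers)
    g    = exp(-tau*(t - centers(j)).^2);   % Gaussian window centered at centers(j)
    spec = abs(fft(g .* s));                % spectrum of the windowed signal
    [~, idx]  = max(spec(1:n/2));           % strongest nonnegative frequency
    peakHz(j) = fAxis(idx);
end
disp(peakHz)
```

Each window center reports the tone that is active around that time, which is the information a single FFT of the whole signal cannot provide.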

Two parameters must be chosen for the STFT. The first is the step interval by which the window slides. The second is the width of the filter window. A narrow window gives good time resolution, but the lower frequencies are lost because they do not fit inside the window. A wide window, on the other hand, picks up more frequency information but localizes events less precisely in time.
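The trade-off can be put in numbers with a short sketch, assuming the Gaussian window e^{-\tau (t-t_0)^2} used on this page. For that window the standard deviation is 1/sqrt(2\tau) in time and sqrt(2\tau)/(2\pi) Hz in frequency. The names sigma_t and sigma_f are illustrative, and the tau values are the ones suggested in the code comments below.

```matlab
tau = [10 50 500 1000];              % window parameters to compare

sigma_t = 1 ./ sqrt(2*tau);          % window spread in time [s]
sigma_f = sqrt(2*tau) / (2*pi);      % matching spread in frequency [Hz]

for i = 1:numel(tau)
    fprintf('tau = %4d: sigma_t = %.4f s, sigma_f = %.2f Hz\n', ...
            tau(i), sigma_t(i), sigma_f(i));
end
```

For tau = 50 this gives a window spread of 0.1 s and a frequency spread of about 1.59 Hz. Increasing tau shrinks the first and grows the second by the same factor, so their product stays fixed.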

In addition, a Gaussian filter is applied to remove noise and obtain a clean music score:

(3)   \begin{equation*} G_{f}\left( t\right) =e^{-\tau \left( t-t_{0}\right)^{2} }\end{equation*}
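The same Gaussian shape can be applied along the frequency axis to keep one band and suppress the rest, which is how the bass is isolated in the code below. The following is a sketch on a synthetic two-tone signal, not the page's own filter. It assumes a filter symmetric in positive and negative frequency, so that the inverse transform is real, and the values of b and width are illustrative.

```matlab
Fs = 1000; n = 1000;                          % assumed sample rate and length
t  = (0:n-1)/Fs;
s  = sin(2*pi*100*t) + 0.5*sin(2*pi*400*t);   % wanted 100 Hz tone plus unwanted 400 Hz tone

f     = [0:n/2-1, -n/2:-1] * Fs/n;            % frequency axis in fft order [Hz]
b     = 100;                                  % filter center [Hz]
width = 0.001;                                % filter width parameter
G     = exp(-width*(abs(f) - b).^2);          % Gaussian around +b and -b

filtered = real(ifft(G .* fft(s)));           % back to the time domain
err = max(abs(filtered - sin(2*pi*100*t)));
fprintf('Max deviation from the pure 100 Hz tone: %g\n', err);
```

With these values the 400 Hz component is attenuated by a factor of about e^{-90}, so what remains is essentially the 100 Hz tone alone.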

Result

The following two figures show the parameter tuning results for Sweet Child O’ Mine under different step intervals and window sizes of the Gabor Transform.

Figure 2: Sweet Child O’ Mine (window size =10)
Figure 3: Sweet Child O’ Mine (window size =50)

Code and Raw Data

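%% Sweet Child O' Mine Guitar
clear all; close all; clc
figure(1)
[y, Fs] = audioread('GNR.m4a');   % assumes a mono recording (y is n-by-1)
L = length(y)/Fs;                 % clip duration in seconds
n = length(y);                    % number of samples (must be even for the indexing below)
t = (1:n)/Fs;                     % time axis in seconds
% angular frequencies in fft order, then shifted so that zero is centered
k = (2*pi/L) * [0:(n/2)-1 -n/2:-1]; ks = fftshift(k);
S = y.';                          % signal as a row vector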
% set step interval and window size
% the window is exp(-tau*(t-t0).^2), so a larger tau gives a narrower window
tau = 50;            % window parameter; try 10, 50, 500, 1000 respectively
tslide = 0:0.1:t(n); % window centers; try steps of 0.1, 0.01, 0.001, 0.0001
ksf = ks/(2*pi);     % frequency axis in Hz
spectro = zeros(length(tslide), n);   % one normalized spectrum per window center
for j = 1 : length(tslide)
    g = exp(-tau*((t-tslide(j)).^2));   % Gaussian window centered at tslide(j)
    yf = S .* g;                        % windowed signal
    yft = fft(yf);
    spectro(j,:) = abs(fftshift(yft))/max(abs(yft));
end

pcolor(tslide, ksf(n/2+3000:n), spectro(:,n/2+3000:n).'), shading interp
set(gca, 'Ylim',[218 5000/(2*pi)])
title('Sweet Child O Mine Spectrogram')
xlabel('Time [sec]'); ylabel('Frequency [Hz]')
colormap(hot)

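%% Comfortably Numb Bass
clear all; close all; clc
figure(1)
[y, Fs] = audioread('Floyd.m4a');   % assumes a mono recording (y is n-by-1)
S = y.';
L = length(y)/Fs;
n = length(y);
t = (1:n)/Fs;
% angular frequencies in fft order; this form is valid for even or odd n
k = (2*pi/L)*[0:ceil(n/2)-1, -floor(n/2):-1]; ks = fftshift(k);
tau = 1;         % time-window parameter (larger tau = narrower window)
tau2 = .001;     % bass filter width
b = 150;         % bass filter center [Hz]
spectro = [];
ksf = ks/(2*pi); % frequencies in Hz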
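% Gaussian bass filter in the frequency domain, centered at b Hz
bfilter = exp(-tau2*((ksf-b).^2));
tslide = linspace(0,t(n),60);   % 60 window centers across the clip
for j = 1 : length(tslide)
    g = exp(-tau*((t-tslide(j)).^2));     % Gaussian time window centered at tslide(j)
    yf = S .* g;                          % windowed signal
    yft = bfilter .* fftshift(fft(yf));   % keep only content near the bass band
    spectro = [spectro; abs(yft)/max(abs(yft))];
end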
figure(2)
% plot from the middle of the shifted axis up to a hard-coded index (1327500)
pcolor(tslide, ksf(floor(n/2) : 1327500), spectro(:, floor(n/2) : 1327500).'), shading interp
set(gca, 'Ylim',[50 150])
title('Comfortably Numb Bass Spectrogram')
xlabel('Time [sec]'); ylabel('Frequency [Hz]');
colormap(hot)
%% clean noise
close all; clear all; clc;
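[y2,Fs] = audioread('Floyd.m4a');   % assumes a mono recording
y = y2(1:end-1);                    % drop the last sample so that n is even
S = y.';
L = length(y)/Fs;
n = length(y);
t2 = linspace(0,L,n+1);
t = t2(1:n);
% n is even here, so the frequency vector already has n entries
k=(2*pi/L)*[0:n/2-1 -n/2:-1]; ks=fftshift(k);
ksf = ks/(2*pi); % frequencies in Hz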

tau=50; tslide=linspace(0,t(end),100);
spectro=zeros(length(tslide),n);
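for j=1:length(tslide)
    g=exp(-tau*(t-tslide(j)).^2);           % Gaussian time window centered at tslide(j)
    ft_spec=fft(g.*S);
    abs_ft_spec=abs(fftshift(ft_spec));
    [M,I]=max(abs_ft_spec(n/2:end));        % strongest component in the upper half of the shifted axis
    I2=I+n/2-1;                             % index of that peak in the full shifted spectrum
    g2=exp(-0.5*((ks-ks(I2)).^2));          % Gaussian filter around the peak frequency
    Sgts2=fftshift(ft_spec).*g2;            % suppress everything away from the peak
    spectro(j,:)=abs(Sgts2);
end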
figure(1)
% n is even here, so (n-1)/2 is not a valid index; start from floor(n/2) instead
pcolor(tslide, ksf(floor(n/2) : 1327500), spectro(:, floor(n/2) : 1327500).'), shading interp
set(gca, 'Ylim',[50 150])
title('Comfortably Numb Bass Spectrogram')
xlabel('Time [sec]'); ylabel('Frequency [Hz]');
colormap(hot)