Speech/Music Discrimination
Hello,
I have a general problem discriminating between speech and music signals in MATLAB. I need to create FFT/frequency graphs for each music and speech file, and then a spectral centroid graph that shows the difference in average frequency between them; that is how I want to discriminate them. I would appreciate your help.
Comments: 2
Thiago Henrique Gomes Lobato
15 Dec 2019
You need to be more specific in your question. What have you tried? How exactly do you want to discriminate them: just by looking at a graph? Sound discrimination is a very complex problem, so more detail is needed. If all you want are the frequency bins, take a look at the spectrogram function.
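As a rough illustration of the "frequency bins" idea, here is a minimal sketch that plots a one-sided magnitude spectrum with base MATLAB's fft and then a time-frequency view with spectrogram. The test tone, the 16 kHz sample rate, and the window sizes are assumptions chosen only so the snippet runs on its own; swap in the output of audioread for real audio. The last two lines need Signal Processing Toolbox.

```matlab
% Sketch: magnitude spectrum of one signal using base MATLAB's fft.
% The test tone is a stand-in; replace x and fs with audioread output.
fs = 16e3;                           % sample rate in Hz (assumed)
t  = (0:fs-1).'/fs;                  % one second of samples
x  = sin(2*pi*440*t) + 0.5*sin(2*pi*1000*t);

N   = numel(x);
X   = fft(x);
mag = abs(X(1:floor(N/2)+1))/N;      % one-sided magnitude
f   = (0:floor(N/2)).'*fs/N;         % frequency axis in Hz

figure
plot(f,mag)
xlabel('Frequency (Hz)')
ylabel('Magnitude')

% Time-frequency view (hamming and spectrogram need Signal Processing Toolbox).
figure
spectrogram(x,hamming(512),256,512,fs,'yaxis')
```

Doing this once per file gives the per-file frequency graphs asked about in the question.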
Answers (1)
Brian Hemmat
20 Dec 2019
Edited: Brian Hemmat
21 Dec 2019
This tutorial also covers some aspects of the area you're looking into (it has an example that uses spectralEntropy for music/speech discrimination):
Below is some simple code to get you started. If you look at the histogram, you can see that a good threshold for the spectral centroid, based on this limited data, is around 330 Hz. That is, if the average spectral centroid of some audio is above 330 Hz, declare it music; if it is below, declare it speech. Treat that number as a starting point: the centroid values depend on the sample rate and window settings passed to spectralCentroid, so read the threshold off the histogram you actually get. The code below requires Audio Toolbox™ R2019a or later.
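desiredFs = 16e3;   % common sample rate for all signals (Hz)

% Read each file and resample it to desiredFs.
% The stereo file is averaged down to mono first.
[music1,fs] = audioread('RockGuitar-16-44p1-stereo-72secs.wav');
music1 = resample(mean(music1,2),desiredFs,fs);
[music2,fs] = audioread('handel.ogg');
music2 = resample(music2,desiredFs,fs);
[speech1,fs] = audioread('Rainbow-16-8-mono-114secs.wav');
speech1 = resample(speech1,desiredFs,fs);
[speech2,fs] = audioread('SpeechDFT-16-8-mono-5secs.wav');
speech2 = resample(speech2,desiredFs,fs);

% Per-frame spectral centroid. Pass desiredFs, the rate the signals now have;
% fs only holds the rate of the last file read. Centroid values scale with
% this rate, so any threshold read from the histogram does too.
music1_centroid = spectralCentroid(music1,desiredFs);
music2_centroid = spectralCentroid(music2,desiredFs);
speech1_centroid = spectralCentroid(speech1,desiredFs);
speech2_centroid = spectralCentroid(speech2,desiredFs);

% Pool the frames of each class.
music_centroid = [music1_centroid;music2_centroid];
speech_centroid = [speech1_centroid;speech2_centroid];

% Overlay the two normalized histograms to see where the classes separate.
figure
h1 = histogram(music_centroid,'Normalization','probability');
hold on
h2 = histogram(speech_centroid,'Normalization','probability');
legend('Music','Speech')
xlabel('Centroid (Hz)')
ylabel('Probability')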
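The decision rule described above (average the centroid over a clip, then compare it with a threshold) could be sketched as follows. This version uses base MATLAB only and computes a magnitude-weighted centroid by hand, which is an assumption for illustration and not necessarily what spectralCentroid computes internally. The stand-in signal, the 30 ms frame, and the 10 ms hop are also assumptions; the 330 Hz threshold is the value quoted in the answer and should be replaced by whatever your own histogram suggests.

```matlab
% Sketch of the decision rule: average the per-frame centroid of a clip
% and compare it with a threshold. Base MATLAB only.
fs   = 16e3;                          % sample rate in Hz (assumed)
t    = (0:2*fs-1).'/fs;               % two seconds of samples
clip = sin(2*pi*200*t);               % stand-in signal; use audioread for real audio

threshold = 330;                      % Hz, value quoted in the answer
frameLen  = round(0.03*fs);           % 30 ms frames (assumed)
hop       = round(0.01*fs);           % 10 ms hop (assumed)
win = 0.54 - 0.46*cos(2*pi*(0:frameLen-1).'/(frameLen-1));  % Hamming window
f   = (0:floor(frameLen/2)).'*fs/frameLen;                  % bin frequencies in Hz

numFrames = floor((numel(clip)-frameLen)/hop) + 1;
centroid  = zeros(numFrames,1);
for k = 1:numFrames
    seg = clip((k-1)*hop + (1:frameLen)) .* win;   % windowed frame
    S   = abs(fft(seg));
    S   = S(1:numel(f));                           % one-sided magnitude spectrum
    centroid(k) = sum(f.*S)/sum(S);                % magnitude-weighted mean frequency
end

avgCentroid = mean(centroid);
if avgCentroid > threshold
    label = 'music';
else
    label = 'speech';
end
fprintf('Average centroid: %.1f Hz -> %s\n',avgCentroid,label)
```

Plotting centroid against frame index for one music clip and one speech clip gives the per-file centroid graph mentioned in the question, with avgCentroid as the single number to compare.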