Speech/Music Discrimination

Question

Ömer Kaan Karaalp on 15 Dec 2019

https://kr.mathworks.com/matlabcentral/answers/496704-speech-music-discrimination

Edited: Brian Hemmat on 21 Dec 2019

Hello,

I have a general problem with discriminating between speech and music signals in MATLAB. I need to create FFT/frequency graphs for each music and speech file, and then create a spectral centroid graph that shows the difference in average frequency between them; that is how I want to discriminate them. I would appreciate your help.

Comments: 2

Thiago Henrique Gomes Lobato on 15 Dec 2019

You need to be more specific in your question. What have you tried? How exactly do you want to discriminate them, just by looking at a graph? Sound discrimination is a very complex problem, so you need to be more specific. If all you want are the frequency bins, take a look at the spectrogram function.

Ömer Kaan Karaalp on 15 Dec 2019

I have 4 audio files: 2 human voices and 2 pieces of music. First, I want to create 4 FFT/frequency graphs, and then use the spectral centroid to find the average frequency of each file. As a result, there should be 4 averages, with a clear gap between the music and human voice signals. That is how I want to discriminate them, but I do not know how to implement this in MATLAB. Thanks for your attention, sir.


Answer 1

Brian Hemmat on 20 Dec 2019

https://kr.mathworks.com/matlabcentral/answers/496704-speech-music-discrimination#answer_407250

Edited: Brian Hemmat on 21 Dec 2019

As a start, take a look at the spectralCentroid function in Audio Toolbox™.
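To make the idea concrete: the spectral centroid of a frame is the weighted mean of the bin frequencies, with the spectrum as the weights. The snippet below is a minimal sketch of that definition using base MATLAB only, on a synthetic 1 kHz tone as a stand-in for real audio. The frame length, the Hann window, and the magnitude weighting are my assumptions for illustration; spectralCentroid has its own defaults for window, overlap, and spectrum type, so its numbers will not match this exactly.

```matlab
% Sketch: spectral centroid of one frame, base MATLAB only.
fs = 16e3;                         % assumed sample rate (Hz)
N  = 1024;                         % assumed frame length (samples)
t  = (0:N-1).'/fs;
x  = sin(2*pi*1000*t);             % synthetic 1 kHz tone (stand-in signal)

w   = 0.5 - 0.5*cos(2*pi*(0:N-1).'/N);   % periodic Hann window
X   = fft(x .* w);
mag = abs(X(1:N/2+1));             % one-sided magnitude spectrum
f   = (0:N/2).' * fs / N;          % frequency of each bin (Hz)

% plot(f, mag) would draw the FFT/frequency graph for this frame.

centroid = sum(f .* mag) / sum(mag);     % magnitude-weighted mean frequency
fprintf('Centroid: %.1f Hz\n', centroid);
```

For real audio you would repeat this per frame and average the results, which is what the per-file mean of the spectralCentroid output gives you.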

This tutorial also covers some aspects of the area you're looking into (it has an example that uses spectralEntropy for music/speech discrimination):

https://www.mathworks.com/help/audio/ug/spectral-descriptors.html#SpectralDescriptorsExample-5
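For intuition about what spectral entropy measures, here is a small base-MATLAB sketch using one common definition: normalize the power spectrum so it sums to one, then take its Shannon entropy divided by the log of the number of bins. The helper names (powerSpec, oneSided, toProb, entropyOf) are hypothetical, and the normalization may differ from what spectralEntropy does internally, so treat it as an illustration of the concept rather than a replacement for the toolbox function.

```matlab
% Sketch: normalized spectral entropy of a frame, base MATLAB only.
rng(0);                                   % fixed seed so the noise is repeatable
fs = 16e3;  N = 1024;                     % assumed sample rate and frame length
t  = (0:N-1).'/fs;
tone  = sin(2*pi*1000*t);                 % energy concentrated in one bin
noise = randn(N,1);                       % energy spread over all bins

% Anonymous helpers (hypothetical names, not toolbox functions)
powerSpec = @(x) abs(fft(x)).^2;
oneSided  = @(P) P(1:N/2+1);
toProb    = @(P) P / sum(P);              % treat the spectrum as a distribution
entropyOf = @(p) -sum(p .* log(p + eps)) / log(numel(p));

hTone  = entropyOf(toProb(oneSided(powerSpec(tone))));
hNoise = entropyOf(toProb(oneSided(powerSpec(noise))));
fprintf('tone: %.3f, noise: %.3f\n', hTone, hNoise);
```

A peaky spectrum gives a value near 0 and a flat one a value near 1. How speech and music fall on that scale depends on the material, which is what the linked tutorial explores on real recordings.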

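Below is some simple code to get you started. If you look at the histogram, you can see that a good threshold for the spectral centroid, based on this limited data, was around 330 Hz when I ran it. That is, if the average spectral centroid for some audio is above the threshold, declare it as music. If it is below, declare it as speech. Treat the number as a starting point: the centroid values depend on the sample rate passed to spectralCentroid and on the files used, so read the threshold off your own histogram. The code below requires Audio Toolbox™ R2019a or above to run.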

desiredFs = 16e3;  % common sample rate for all files (Hz)

% Read each file, mix stereo down to mono where needed, and resample
[music1,fs] = audioread('RockGuitar-16-44p1-stereo-72secs.wav');
music1 = resample(mean(music1,2),desiredFs,fs);
[music2,fs] = audioread('handel.ogg');
music2 = resample(music2,desiredFs,fs);
[speech1,fs] = audioread('Rainbow-16-8-mono-114secs.wav');
speech1 = resample(speech1,desiredFs,fs);
[speech2,fs] = audioread('SpeechDFT-16-8-mono-5secs.wav');
speech2 = resample(speech2,desiredFs,fs);

% Per-frame spectral centroid (Hz). Pass desiredFs, the rate after
% resampling; fs only holds the native rate of the last file read.
% The centroid values scale with the rate passed here, so choose the
% threshold from the histogram this produces.
music1_centroid = spectralCentroid(music1,desiredFs);
music2_centroid = spectralCentroid(music2,desiredFs);
speech1_centroid = spectralCentroid(speech1,desiredFs);
speech2_centroid = spectralCentroid(speech2,desiredFs);

% Pool the frames of each class
music_centroid = [music1_centroid;music2_centroid];
speech_centroid = [speech1_centroid;speech2_centroid];

% Overlaid histograms of the two classes
figure
histogram(music_centroid,'Normalization','probability');
hold on
histogram(speech_centroid,'Normalization','probability');
legend('Music','Speech')
xlabel('Centroid (Hz)')
ylabel('Probability')
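
Once you have picked a threshold from the histogram, the decision rule described above can be sketched as follows. The centroid vectors here are made-up illustrative numbers, not measurements from the files above, and thresholdHz, labels, and classify are hypothetical names; in practice you would pass the per-frame output of spectralCentroid for each file.

```matlab
% Sketch: turn per-frame centroids into a speech/music decision.
% thresholdHz is a placeholder; read your own value off the histogram.
thresholdHz = 330;

% Made-up stand-in centroid vectors (Hz), for illustration only.
centroidsA = [250; 300; 280; 310];
centroidsB = [900; 1200; 750; 1100];

labels = {'speech','music'};

% Average over frames, ignoring frames whose centroid is undefined (NaN),
% which can happen for all-zero frames, then compare with the threshold.
classify = @(c) labels{1 + (mean(c,'omitnan') > thresholdHz)};

fprintf('File A: %s\n', classify(centroidsA));
fprintf('File B: %s\n', classify(centroidsB));
```

With only four files, a single fixed threshold is a rough rule, so check it against more recordings before relying on it.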
