Need help with word identification using fft()

Question

Wynand on 23 Oct 2023


https://kr.mathworks.com/matlabcentral/answers/2037641-need-help-with-word-identification-using-fft

Answered: Shivansh on 15 Nov 2023

I am writing a program that listens to a voice and returns the most probable matching word. The problem I am running into is that fft() returns such similar frequency bands for different words that the comparisons come out nearly identical and are therefore inaccurate. I have attached the function I wrote for this. Any hints for getting a more accurate analysis would be greatly appreciated.

My database is 10 words with 10 versions each, stored in a 10x10 cell matrix. For the comparison, I compute the total difference between the recorded word and every database word, and the word with the lowest total difference is assumed to be the closest match.
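For reference, the matching rule described in the paragraph above can be sketched roughly as follows. This is not the attached FFT_Output code, which is not reproduced here. The variable names, the feature length, the toy data, and the unit-norm scaling step are all assumptions made for illustration; the scaling is included only so that overall loudness does not dominate the distance.

```matlab
% Sketch: pick the database word with the smallest total distance.
% db{w, v} holds a feature vector for version v of word w (hypothetical layout).
nWords = 10; nVersions = 10; nFeat = 12;
rng(0);                                   % reproducible toy data
db = cell(nWords, nVersions);
for w = 1:nWords
    proto = zeros(nFeat, 1);
    proto(w) = 5;                         % each toy word peaks in its own band
    for v = 1:nVersions
        db{w, v} = proto + 0.1*randn(nFeat, 1);
    end
end

query = zeros(nFeat, 1);                  % toy "recorded word"
query(3) = 5;

% Assumption: scale each vector to unit length before comparing
unit = @(f) f / (norm(f) + eps);

totalDiff = zeros(nWords, 1);
for w = 1:nWords
    for v = 1:nVersions
        totalDiff(w) = totalDiff(w) + norm(unit(query) - unit(db{w, v}));
    end
end
[~, bestWord] = min(totalDiff);           % lowest total difference wins
```

With real recordings, db{w, v} and query would hold whatever FFT-derived features are being compared, and all of them need the same length for the subtraction to work.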

FFT_Output was the first iteration.

I then tried to rewrite it more cleanly as FFT_Output_2. I think it is functionally much the same, but easier to digest.

Thank you

Comments: 2

Star Strider on 24 Oct 2023

You would need to use either spectrogram or pspectrum (with the 'spectrogram' option), and then use relatively sophisticated pattern-matching. See Formant Estimation with LPC Coefficients for a representative approach. Using fft alone is not going to be sufficient.

Wynand on 24 Oct 2023

Thank you for the reply. Unfortunately, our objective is to use fft() and not MATLAB's other built-in functions.


Answer 1

Shivansh on 15 Nov 2023


https://kr.mathworks.com/matlabcentral/answers/2037641-need-help-with-word-identification-using-fft#answer_1353037

Hi Wynand,

I understand that you are using FFT to transform the voice signal into the frequency domain and then finding the peaks to differentiate between words. The issue with this approach is that it focuses mainly on the frequency content of the signal, which may not be sufficient to differentiate between different words pronounced by different people with different voice characteristics.

You could try mfcc() or a machine learning technique, as these can give significantly better results.

Since you want to stay with fft(), you can still improve your results by refining the FFT-based approach. Here are some suggestions:

Try a larger number of bins, and space them on a non-uniform scale (for example logarithmic or mel) instead of dividing the FFT output into bins of equal width.
You are currently calling findpeaks with fixed parameters, which might not be optimal for every word. Consider a peak detection approach better suited to audio signals.
You are currently using only the magnitude of the FFT output. You could try including the phase as well to help distinguish between words.
You can also experiment with windows other than the Blackman window.
You might want to use overlapping frames instead of dividing the signal into non-overlapping frames. This gives a smoother representation of the signal and might improve your results.
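A minimal sketch of how overlapping frames, a window, and non-uniform bins might fit together, using only fft() and other base MATLAB functions. Everything here is an assumption for illustration: the sample rate, frame length, hop size, number of bands, the 100 Hz lower edge, and the sine tone standing in for a recorded word. The Hann window is written out by hand so that no toolbox function is needed.

```matlab
% Sketch: framewise log-band energies using only fft() from base MATLAB.
% All names and parameter values here are illustrative assumptions.
fs = 8000;                          % assumed sample rate in Hz
t  = (0:fs-1).' / fs;               % one second of samples
x  = sin(2*pi*440*t);               % stand-in for a recorded word

frameLen = 256;                     % samples per frame
hop      = 128;                     % 50% overlap between frames
nBands   = 12;                      % number of log-spaced bands

% Hann window written out by hand so no toolbox is required
n   = (0:frameLen-1).';
win = 0.5 - 0.5*cos(2*pi*n/(frameLen-1));

% Log-spaced band edges from 100 Hz up to the Nyquist frequency
edges = logspace(log10(100), log10(fs/2), nBands+1);
freqs = (0:frameLen/2).' * fs / frameLen;   % frequency of each FFT bin

nFrames  = floor((numel(x) - frameLen)/hop) + 1;
features = zeros(nBands, nFrames);

for k = 1:nFrames
    idx = (k-1)*hop + (1:frameLen);         % samples belonging to frame k
    X   = fft(x(idx) .* win);
    P   = abs(X(1:frameLen/2+1)).^2;        % one-sided power spectrum
    for b = 1:nBands
        inBand = freqs >= edges(b) & freqs < edges(b+1);
        features(b, k) = log(sum(P(inBand)) + eps);
    end
end
```

Each column of features then describes one short time slice of the word, so two recordings can be compared frame by frame rather than through a single FFT of the whole signal. Recordings of different lengths produce different numbers of columns, so some form of time alignment or averaging would still be needed before comparing them.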

For more information, see the fft() documentation: https://in.mathworks.com/help/matlab/ref/fft.html

Hope it helps!
