Audio to Mel Spectrogram


Mudasser Ahmad, 22 September 2023


https://kr.mathworks.com/matlabcentral/answers/2024367-audio-to-mel-spectrogram


Hello, I am working on a sound classification problem. My task is to create mel spectrograms with three different window lengths (93 ms, 46 ms, and 23 ms), which is achieved by setting n_fft to 2048, 1024, and 512 respectively. I am getting a shape of (128, 216), but I don't understand the 3 in (128, 216, 3). Here 128 is the number of frequency bins and 216 is the number of frames. Can someone help me understand the right side of the attached image, the DL part?
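The relationship between n_fft and window length in the question can be checked with a little arithmetic. The sketch below assumes a sample rate of 22050 Hz, which is the default that librosa.load resamples to when no sr argument is given; the helper name window_ms is only illustrative.

```python
# Assumption: sr = 22050 Hz, the default sample rate of librosa.load
# when called without an explicit sr argument.
SR = 22050


def window_ms(n_fft, sr=SR):
    """Duration of an n_fft-sample analysis window, in milliseconds."""
    return 1000.0 * n_fft / sr


for n_fft in (2048, 1024, 512):
    # Prints roughly 93 ms, 46 ms and 23 ms for the three window sizes.
    print(f"n_fft={n_fft}: {window_ms(n_fft):.1f} ms")
```

If the audio is loaded at a different sample rate, the same n_fft values correspond to different window durations.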

Comments: 2

Mathieu NOE, 22 September 2023

You have 3 time windows, so you are computing 3 spectrograms, each one an array of size 128 x 216.

In the end, your 3 spectrograms are stored in a 3D array of size 128 x 216 x 3.
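The stacking described here can be sketched with NumPy alone. The arrays below are random stand-ins with the shapes from the question, not real mel spectrograms, and the variable names are only illustrative.

```python
import numpy as np

# Hypothetical stand-ins for the three mel spectrograms (random values,
# not real audio): each one is 128 mel bins x 216 frames.
rng = np.random.default_rng(0)
specs = [rng.random((128, 216)) for _ in range(3)]

# Stack along a new last axis: one "channel" per window length.
tensor = np.stack(specs, axis=-1)
print(tensor.shape)  # (128, 216, 3)

# Channel k of the tensor is exactly the k-th spectrogram.
assert np.array_equal(tensor[:, :, 1], specs[1])
```

The last axis plays the same role as the colour channels of an RGB image, which is presumably why the result suits image-style deep learning models.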

Mudasser Ahmad, 22 September 2023

Thanks for your feedback.

Is my code doing this correctly? Is this what the image shows?

import librosa
import numpy as np

# Load the audio file (replace the path with your own audio file)
y, sr = librosa.load(r'G:\A NEW RESEARCH DATASET\1Fire\2_Fire.wav')

# n_fft values for the three window lengths (about 93 ms, 46 ms and 23 ms)
n_ffts = [2048, 1024, 512]

# List to hold the spectrograms
spectrograms = []

# Generate one mel spectrogram per n_fft value
for n_fft in n_ffts:
    mel_spec = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=n_fft, hop_length=512, n_mels=128)
    spectrograms.append(mel_spec)

# Stack the spectrograms along the third dimension
tensor = np.stack(spectrograms, axis=-1)

# Prints (128, time_steps, 3): 128 comes from n_mels, and time_steps depends on the length of the audio file
print(tensor.shape)
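One reason the stacking step works is that all three spectrograms share the same hop_length, so they have the same number of frames even though n_fft differs. The sketch below uses the frame-count formulas for librosa's STFT with centred framing (center=True, the default) and, for contrast, with center=False. The 5-second clip length at 22050 Hz is an assumption chosen to match the 216 frames reported in the question, and n_frames is a hypothetical helper.

```python
def n_frames(n_samples, hop_length, n_fft, center=True):
    """Number of STFT frames for a signal of n_samples samples."""
    if center:
        # The signal is padded by n_fft // 2 on each side, so the count
        # no longer depends on n_fft.
        return 1 + n_samples // hop_length
    # Without padding, longer windows fit fewer times into the signal.
    return 1 + (n_samples - n_fft) // hop_length


# Assumption: a 5-second clip at 22050 Hz.
n_samples = 5 * 22050

for n_fft in (2048, 1024, 512):
    centred = n_frames(n_samples, 512, n_fft)
    uncentred = n_frames(n_samples, 512, n_fft, center=False)
    print(f"n_fft={n_fft}: center=True -> {centred}, center=False -> {uncentred}")
```

With center=False the three frame counts would differ, and np.stack would raise an error because the arrays would no longer have matching shapes.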
