Creating dataset of different size audio samples

Question

0 votes

I want to train a classification neural network for speech emotion recognition. The dataset consists of audio samples, each labeled with an emotion. So far I have done the following:

Resampled the audio to 16 kHz
Applied a discrete wavelet transform (DWT) with the 'db8' wavelet

The problem is that the DWT output vectors have different lengths, because the audio clips have different lengths, so I cannot put them into a single input matrix to train the network. Padding won't work because the matrix would become very large.

Any suggestions on how to make all the vectors the same length, or on using another filter instead of the DWT?

Thanks in advance.

Comments: 1

boudy on 29 Nov 2015

Thanks a lot.


Answer 1

Walter Roberson on 27 Nov 2015

1 vote

Using a different filter is not going to make any difference. Filter output length is proportional to input length, so longer audio results in longer filtered output. You will either need to use a different strategy, or you will need to pad or truncate your vectors so that they are all the same length.
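To illustrate the pad-or-truncate option, here is a minimal sketch in plain Python (not MATLAB, and not code from the original poster). The helper names `pad_or_truncate` and `build_matrix`, the zero pad value, and the sample numbers are all assumptions made up for this example.

```python
def pad_or_truncate(vec, target_len, pad_value=0.0):
    """Return a copy of vec with exactly target_len entries."""
    if target_len < 0:
        raise ValueError("target_len must be non-negative")
    trimmed = list(vec[:target_len])        # drop anything past target_len
    missing = target_len - len(trimmed)     # entries still needed, 0 if long enough
    return trimmed + [pad_value] * missing  # pad the tail when the vector is short


def build_matrix(vectors, target_len):
    """Stack variable-length vectors into rows of equal length."""
    return [pad_or_truncate(v, target_len) for v in vectors]


# Three coefficient vectors of different lengths (made-up values)
features = [[0.5, -1.0, 2.0, 0.25, 1.5], [1.0, 3.0], [4.0, 0.0, -2.0]]
matrix = build_matrix(features, target_len=4)
for row in matrix:
    print(row)
```

Choosing `target_len` is the real trade-off: a value near the longest clip keeps everything but makes the matrix large, while a smaller value keeps the matrix compact at the cost of discarding the tail of longer clips.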

You might want to consider using something like an FFT computed with a fixed number of points.
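A rough sketch of the fixed-point idea, written in plain Python so it runs without extra libraries. It uses a naive O(N^2) DFT rather than a true FFT; in practice a library routine such as MATLAB's `fft(x, n)` or NumPy's `numpy.fft.rfft(x, n)` does the same zero-pad-or-truncate step and is much faster. The function name `fixed_length_spectrum`, the choice of 32 points, and the synthetic sine "clips" are assumptions for illustration only.

```python
import cmath
import math


def fixed_length_spectrum(signal, n_points):
    """Magnitudes of an n_points DFT of signal, zero-padded or truncated first.

    Returns n_points // 2 + 1 values (the non-redundant half for real input),
    so the feature length depends only on n_points, not on the clip length.
    """
    # Force the signal to exactly n_points samples
    x = list(signal[:n_points])
    x += [0.0] * (n_points - len(x))

    magnitudes = []
    for k in range(n_points // 2 + 1):
        # Naive DFT bin k; an FFT routine returns the same values faster
        acc = sum(
            x[t] * cmath.exp(-2j * math.pi * k * t / n_points)
            for t in range(n_points)
        )
        magnitudes.append(abs(acc))
    return magnitudes


# Two synthetic clips of different durations containing the same tone
short_clip = [math.sin(2 * math.pi * 4 * t / 32) for t in range(20)]
long_clip = [math.sin(2 * math.pi * 4 * t / 32) for t in range(50)]

a = fixed_length_spectrum(short_clip, n_points=32)
b = fixed_length_spectrum(long_clip, n_points=32)
print(len(a), len(b))  # both feature vectors have the same length
```

Note that anything past `n_points` samples is discarded here, so for long recordings it may be preferable to compute the transform over several frames and average the magnitudes; that variation is not shown.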
