I want to use an LSTM-based audio network with live audio

Arslan Munim on 27 Jul 2022
Commented: Arslan Munim on 28 Sep 2022
Hello Matlab team,
I am using this example to work with my audio data set: https://www.mathworks.com/matlabcentral/fileexchange/74611-fault-detection-using-deep-learning-classification#examples_tab The network is trained, but I want to run it live on a PC. For example, I have a mic and want to build an application that uses my trained model to predict the output.
Can you guide me or help me with that?
Regards,
Arslan Munaim

Answers (2)

jibrahim on 27 Jul 2022
Hi Arslan,
There is a function in that repo (streamingClassifier) that should get the job done in conjunction with an audio device reader:
% Create a microphone object
adr = audioDeviceReader(SampleRate=16e3,SamplesPerFrame=512);
% These statistics should come from your training...
M = 0;
S = 1;
while 1
    % Read a frame of data from microphone
    frame = adr();
    % Pass to network
    scores = streamingClassifier(frame,M,S);
    % Use the scores any way you want
end
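As a rough sketch of where M and S might come from: this assumes they are the per-feature mean and standard deviation used to normalize the training features, which should be checked against your own training script. The variable trainingFeatures here is a hypothetical stand-in (one numFeatures-by-numFrames matrix per audio file), filled with random data only so the snippet runs on its own.

```matlab
% Hypothetical stand-in for the training features: a cell array with one
% (numFeatures x numFrames) matrix per audio file. Replace with your own.
rng(0);
trainingFeatures = {randn(3,40)+2, randn(3,55)+2, randn(3,25)+2};

% Pool the frames of all files and compute one mean and one standard
% deviation per feature (assumption: this matches the training normalization)
allFeatures = cat(2,trainingFeatures{:});
M = mean(allFeatures,2,"omitnan");
S = std(allFeatures,0,2,"omitnan");

% Sanity check: normalizing the pooled features with M and S should give
% each feature approximately zero mean and unit standard deviation
normalized = (allFeatures - M)./S;
```

If M and S are saved alongside the trained network, the streaming loop can load them instead of using the placeholder values 0 and 1.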
Comments: 5
jibrahim on 2 Aug 2022
Hi Arslan,
Since you trained the network with a sample rate of 16 kHz, you will have to perform sample-rate conversion from 44.1 kHz to 16 kHz. This code is a possible implementation, where you essentially feed the network frames of length 512 sampled at 16 kHz, just like the original code:
% Create a microphone object
%adr = audioDeviceReader(SampleRate=16e3,SamplesPerFrame=512);
src = dsp.SampleRateConverter(InputSampleRate=44100,OutputSampleRate=16e3,...
    Bandwidth=15800);
[~,D] = src.getRateChangeFactors;
% The frame size must be a multiple of 441 (the decimation factor of the
% sample rate converter)
L = floor(22000/D);
frameLength = L*D; % get as close to desired frame size
adr = audioDeviceReader(SampleRate=44100,SamplesPerFrame=frameLength);
buff = dsp.AsyncBuffer;
% These statistic values should come from your training...
M = 0;
S = 1;
while 1
    % Read a frame of data from microphone
    frame = adr();
    % Convert to 16 kHz
    frame = src(frame);
    % Save to buffer
    write(buff,frame)
    while buff.NumUnreadSamples >= 512
        frame = read(buff,512);
        % Pass to network
        scores = streamingClassifier(frame,M,S);
        % Use the scores any way you want
    end
end
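The frame-size arithmetic in the code above can be checked by hand. The sketch below is plain arithmetic and does not call dsp.SampleRateConverter; it assumes the converter uses the exact 16000/44100 ratio (the object may choose different factors if an output rate tolerance is allowed). All variable names are illustrative.

```matlab
fsIn  = 44100;   % microphone sample rate (Hz)
fsOut = 16000;   % sample rate the network was trained on (Hz)

% Reduce the ratio fsOut/fsIn to lowest terms
g = gcd(fsIn,fsOut);
interpFactor = fsOut/g;   % 160
decimFactor  = fsIn/g;    % 441

% Largest input frame that is a multiple of the decimation factor and
% does not exceed the desired 22000 samples
numBlocks   = floor(22000/decimFactor);
frameLength = numBlocks*decimFactor;    % 21609 samples at 44.1 kHz

% Each input frame yields this many samples at 16 kHz
outLength = numBlocks*interpFactor;     % 7840 samples

% Number of full 512-sample network frames per input frame, and the
% remainder that stays in the buffer for the next iteration
numFullFrames = floor(outLength/512);            % 15
leftover      = outLength - 512*numFullFrames;   % 160
```

The nonzero remainder is why the loop reads from a buffer rather than passing the converter output straight to the network.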
Note that you can also potentially feed the network longer frames. That should also work, and is probably more efficient, as the network will run faster if you give it a long input (as opposed to multiple short ones):
% Create a microphone object
%adr = audioDeviceReader(SampleRate=16e3,SamplesPerFrame=512);
src = dsp.SampleRateConverter(InputSampleRate=44100,OutputSampleRate=16e3,Bandwidth=15800);
[~,D] = src.getRateChangeFactors;
% The frame size must be a multiple of 441 (the decimation factor of the
% sample rate converter)
L = floor(22000/D);
frameLength = L*D;
adr = audioDeviceReader(SampleRate=44100,SamplesPerFrame=frameLength);
buff = dsp.AsyncBuffer;
% These statistic values should come from your training...
M = 0;
S = 1;
while 1
    % Read a frame of data from microphone
    frame = adr();
    % Convert to 16 kHz
    frame = src(frame);
    % Save to buffer
    write(buff,frame)
    N = buff.NumUnreadSamples;
    L = floor(N/512);
    if L>0
        frame = read(buff,512*L);
        % Pass to network
        scores = streamingClassifier(frame,M,S);
        % Use the scores any way you want
    end
end
If you can't change the frame size on the microphone, then you can handle that using another buffer, for example:
% Create a microphone object
%adr = audioDeviceReader(SampleRate=16e3,SamplesPerFrame=512);
src = dsp.SampleRateConverter(InputSampleRate=44100,OutputSampleRate=16e3,Bandwidth=15800);
[~,D] = src.getRateChangeFactors;
% The converter input size must be a multiple of 441 (the decimation
% factor of the sample rate converter)
L = floor(22000/D);
frameLength = L*D;
% The device frame size is fixed at 22000 here
adr = audioDeviceReader(SampleRate=44100,SamplesPerFrame=22000);
buffSRC = dsp.AsyncBuffer;
buff = dsp.AsyncBuffer;
% These statistic values should come from your training...
M = 0;
S = 1;
while 1
    % Read a frame of data from microphone
    frame = adr();
    write(buffSRC,frame);
    % Drain the input buffer in converter-sized chunks so the samples left
    % over from each read (22000 - frameLength) do not accumulate
    while buffSRC.NumUnreadSamples >= frameLength
        frame = read(buffSRC,frameLength);
        % Convert to 16 kHz
        frame = src(frame);
        % Save to buffer
        write(buff,frame)
    end
    N = buff.NumUnreadSamples;
    L = floor(N/512);
    if L>0
        frame = read(buff,512*L);
        % Pass to network
        scores = streamingClassifier(frame,M,S);
        % Use the scores any way you want
    end
end
Arslan Munim on 9 Aug 2022
Thank you for your support; it was very helpful.
Now I want to use multiple mics for prediction. Can you please give me some idea of how I can use the streaming classifier with 3 or 4 mics for the prediction?
Thanks and have a nice day.
Regards,
Arslan


jibrahim on 9 Aug 2022
Hi Arslan,
audioDeviceReader supports multi-mic devices. Use the ChannelMappingSource and ChannelMapping properties to map between device input channels and the output data.
This network was trained on mono data, so, to adapt it to multi-channel data, you either have to retrain your network for multi-channel data, or somehow combine your input channels into one channel (by a weighted sum, or selecting a particular channel, etc) and proceed like above.
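A minimal sketch of the two channel-combining options mentioned above, using a synthetic 4-channel frame in place of the output of a multi-channel audioDeviceReader. The frame contents, the choice of channel 2, and the equal weights are all assumptions made for illustration only.

```matlab
% Hypothetical 4-channel frame: 512 samples per channel, one column per
% microphone (column k holds k*(1:512).')
numChannels = 4;
frame = repmat((1:512).',1,numChannels) .* (1:numChannels);

% Option 1: select a single channel (channel 2 chosen arbitrarily)
monoSelected = frame(:,2);

% Option 2: weighted sum of the channels. Equal weights that sum to 1 are
% an assumption; any weighting suited to your mic placement could be used.
w = ones(numChannels,1)/numChannels;
monoWeighted = frame*w;
```

Either mono signal can then go through the same sample-rate conversion and buffering steps as in the single-mic code.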
Comments: 23
jibrahim on 20 Aug 2022
OK, this helps. You will need other hardware (one device, multiple mics) for the system to recognize it. You could also give the UDP idea a shot, see how viable that is.
Arslan Munim on 28 Sep 2022
Hi again,
I am trying to train my network after lowering BitsPerSample from 16 to 8. Every time I start training the model, it throws the warning given below and terminates.
I tried different sample rates, but I get the same warning every time. I also tried changing my layer structure and setting 'InitialLearnRate' to 0.001, but I still get the same warning.
Warning: Training stopped at iteration 1 because training loss is NaN. Predictions using the output network might contain NaN values.
Model:
layers = [ ...
    sequenceInputLayer(size(trainingFeatures{1},1))
    lstmLayer(100,"OutputMode","sequence")
    dropoutLayer(0.1)
    lstmLayer(100,"OutputMode","last")
    fullyConnectedLayer(5)
    softmaxLayer
    classificationLayer];
miniBatchSize = 30;
validationFrequency = floor(numel(trainingFeatures)/miniBatchSize);
options = trainingOptions("adam", ...
    "MaxEpochs",100, ...
    "MiniBatchSize",miniBatchSize, ...
    "Plots","training-progress", ...
    "Verbose",false, ...
    "Shuffle","every-epoch", ...
    "LearnRateSchedule","piecewise", ...
    "LearnRateDropFactor",0.1, ...
    "LearnRateDropPeriod",20,...
    'InitialLearnRate',0.001,...
    'ValidationData',{validationFeatures,adsValidation.Labels}, ...
    'ValidationFrequency',validationFrequency);
Regards,
Arslan
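The thread does not identify the cause of this warning. One thing that may be worth ruling out is non-finite values in the input features, since a NaN or Inf in the training data leads to a NaN loss at the first iteration. The sketch below uses tiny made-up feature matrices and assumes the features are normalized as (x - M)./S; in that case a feature with zero spread produces 0/0 = NaN. The guard shown (replacing a zero standard deviation with 1) is one possible workaround, not a confirmed fix for this case.

```matlab
% Hypothetical features: two files, two features each. The second feature
% is constant across all frames, so its standard deviation is zero.
trainingFeatures = {[1 2 3; 5 5 5], [4 5 6; 5 5 5]};

% Normalization statistics pooled over all files (assumed scheme)
allFeatures = cat(2,trainingFeatures{:});
M = mean(allFeatures,2,"omitnan");
S = std(allFeatures,0,2,"omitnan");

% Count non-finite entries per file after normalization
normalizeFcn = @(x,S) (x - M)./S;
numBadBefore = cellfun(@(x) nnz(~isfinite(normalizeFcn(x,S))), trainingFeatures);

% Locate features with zero spread and guard against division by zero
zeroSpread = find(S == 0);
Sguarded = S;
Sguarded(zeroSpread) = 1;
numBadAfter = cellfun(@(x) nnz(~isfinite(normalizeFcn(x,Sguarded))), trainingFeatures);
```

Running the same count on the real trainingFeatures and validationFeatures shows quickly whether the data itself contains NaN or Inf before any change to the network or training options.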

Release: R2022a