i want to use LSTM based audio network to work with Live audio
Views: 4 (last 30 days)
Show older comments
Hello Matlab team,
I am using this example to work with my audio data set: https://www.mathworks.com/matlabcentral/fileexchange/74611-fault-detection-using-deep-learning-classification#examples_tab The model is trained on my dataset, but now I want to make the application live on my PC. For example, I have a mic and want to build an application that uses my trained model to predict the output.
Can you guide me or help me with that?
Regards,
Arslan Munaim
Answers (2)
jibrahim
27 Jul 2022
Hi Arslan,
There is a function in that repo (streamingClassifier) that should get the job done, in conjunction with an audio device reader:
% Create a microphone object
adr = audioDeviceReader(SampleRate=16e3,SamplesPerFrame=512);
% These statistic values should come from your training...
M = 0;
S = 1;
while 1
    % Read a frame of data from microphone
    frame = adr();
    % Pass to network
    scores = streamingClassifier(frame,M,S);
    % Use the scores any way you want
end
Comments: 5
Arslan Munim
28 Jul 2022
Edited: Arslan Munim on 28 Jul 2022
Thanks for your reply. I tried using streamingClassifier; however, I am trying to use the extract function instead of the extractFeatures function (because of dependency issues). With extract I can only use one feature at a time, but I trained the network with 11 features.
Can you please tell me how I can use the extract function in streamingClassifier? I am attaching my code for reference:
windowLength = 512;
overlapLength = 0;
aFE = audioFeatureExtractor('SampleRate',44100, ...
    'Window',hamming(windowLength,'periodic'), ...
    'OverlapLength',overlapLength, ...
    'spectralCentroid',true, ...
    'spectralCrest',true, ...
    'spectralDecrease',true, ...
    'spectralEntropy',true, ...
    'spectralFlatness',true, ...
    'spectralFlux',true, ...
    'spectralKurtosis',true, ...
    'spectralRolloffPoint',true, ...
    'spectralSkewness',true, ...
    'spectralSlope',true, ...
    'spectralSpread',true);
features = extract(aFE,audioIn);
% features = extractFeatures(audioIn);
% Normalize
features = ((features - M')./S');
[net,scores] = predictAndUpdateState(net,features);
jibrahim
28 Jul 2022
Hi Arslan,
The extract function also returns 11 features. For example, if you replace the existing function extractFeatures with this modified function, things should work the same:
function featureVector = extractFeatures2(x)
%#codegen
persistent afe
if isempty(afe)
    windowLength = 512;
    overlapLength = 0;
    afe = audioFeatureExtractor('SampleRate',44100, ...
        'Window',hamming(windowLength,'periodic'), ...
        'OverlapLength',overlapLength, ...
        'spectralCentroid',true, ...
        'spectralCrest',true, ...
        'spectralDecrease',true, ...
        'spectralEntropy',true, ...
        'spectralFlatness',true, ...
        'spectralFlux',true, ...
        'spectralKurtosis',true, ...
        'spectralRolloffPoint',true, ...
        'spectralSkewness',true, ...
        'spectralSlope',true, ...
        'spectralSpread',true);
end
featureVector = extract(afe,x);
end
The size of featureVector will be 1-by-11, each element in the vector representing one of your features.
Notice that I declared afe as persistent. This ensures the audio feature extractor is not recreated every time you call this function in your loop. The extractor goes through some one-time setup computations when you first call it; no need to waste time repeating those.
Arslan Munim
2 Aug 2022
I want to use a sample rate of 44100 Hz and SamplesPerFrame of 22000. The code works fine with SampleRate=16e3 and SamplesPerFrame=512, but when I increase the sample rate and samples per frame, the size of featureVector increases and I always get an incompatible-sizes error:
Arrays have incompatible sizes for this operation.
Error in streamingClassifier (line 15)
feature = ((features - M')./S');
Error in scMatlab (line 9)
scores = streamingClassifier(frame,M,S);
The feature vector generated with SampleRate=44100 and SamplesPerFrame=22000 is always x-by-11.
Can you please tell me how I can reduce it to just 1-by-11?
Thanks,
Arslan
jibrahim
2 Aug 2022
Hi Arslan,
Since you trained the network with a sample rate of 16 kHz, you will have to perform sample-rate conversion from 44.1 kHz to 16 kHz. This code is a possible implementation, where you essentially feed the network frames of length 512 sampled at 16 kHz, just like the original code:
% Create a microphone object
%adr = audioDeviceReader(SampleRate=16e3,SamplesPerFrame=512);
src = dsp.SampleRateConverter(InputSampleRate=44100,OutputSampleRate=16e3, ...
    Bandwidth=15800);
[~,D] = src.getRateChangeFactors;
% The frame size must be a multiple of 441 (the decimation factor of the
% sample rate converter)
L = floor(22000/D);
frameLength = L*D; % get as close to desired frame size
adr = audioDeviceReader(SampleRate=44100,SamplesPerFrame=frameLength);
buff = dsp.AsyncBuffer;
% These statistic values should come from your training...
M = 0;
S = 1;
while 1
    % Read a frame of data from microphone
    frame = adr();
    % Convert to 16 kHz
    frame = src(frame);
    % Save to buffer
    write(buff,frame)
    while buff.NumUnreadSamples >= 512
        frame = read(buff,512);
        % Pass to network
        scores = streamingClassifier(frame,M,S);
        % Use the scores any way you want
    end
end
Note that you can also potentially feed the network longer frames. That should also work, and is probably more efficient, as the network will run faster if you give it a long input (as opposed to multiple short ones):
% Create a microphone object
%adr = audioDeviceReader(SampleRate=16e3,SamplesPerFrame=512);
src = dsp.SampleRateConverter(InputSampleRate=44100,OutputSampleRate=16e3,Bandwidth=15800);
[~,D] = src.getRateChangeFactors;
% The frame size must be a multiple of 441 (the decimation factor of the
% sample rate converter)
L = floor(22000/D);
frameLength = L*D;
adr = audioDeviceReader(SampleRate=44100,SamplesPerFrame=frameLength);
buff = dsp.AsyncBuffer;
% These statistic values should come from your training...
M = 0;
S = 1;
while 1
    % Read a frame of data from microphone
    frame = adr();
    % Convert to 16 kHz
    frame = src(frame);
    % Save to buffer
    write(buff,frame)
    N = buff.NumUnreadSamples;
    L = floor(N/512);
    if L>0
        frame = read(buff,512*L);
        % Pass to network
        scores = streamingClassifier(frame,M,S);
        % Use the scores any way you want
    end
end
If you can't change the frame size on the microphone, then you can handle that using another buffer, for example:
% Create a microphone object
%adr = audioDeviceReader(SampleRate=16e3,SamplesPerFrame=512);
src = dsp.SampleRateConverter(InputSampleRate=44100,OutputSampleRate=16e3,Bandwidth=15800);
[~,D] = src.getRateChangeFactors;
% The frame size must be a multiple of 441 (the decimation factor of the
% sample rate converter)
L = floor(22000/D);
frameLength = L*D;
adr = audioDeviceReader(SampleRate=44100,SamplesPerFrame=22000);
buffSRC = dsp.AsyncBuffer;
buff = dsp.AsyncBuffer;
% These statistic values should come from your training...
M = 0;
S = 1;
while 1
    % Read a frame of data from microphone
    frame = adr();
    write(buffSRC,frame);
    frame = read(buffSRC,frameLength);
    % Convert to 16 kHz
    frame = src(frame);
    % Save to buffer
    write(buff,frame)
    N = buff.NumUnreadSamples;
    L = floor(N/512);
    if L>0
        frame = read(buff,512*L);
        % Pass to network
        scores = streamingClassifier(frame,M,S);
        % Use the scores any way you want
    end
end
Arslan Munim
9 Aug 2022
Thank you for your support; it was very helpful.
Now I want to use multiple mics for prediction. Can you please give me some idea of how I can use the streaming classifier with 3 or 4 mics for the prediction?
Thanks and have a nice day.
Regards,
Arslan
jibrahim
9 Aug 2022
Hi Arslan,
audioDeviceReader supports multi-mic devices. Use the ChannelMappingSource and ChannelMapping properties to map between device input channels and the output data.
This network was trained on mono data, so to adapt it to multi-channel data you either have to retrain your network on multi-channel data, or somehow combine your input channels into one channel (via a weighted sum, by selecting a particular channel, etc.) and proceed as above.
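For example, a minimal sketch of the weighted-sum approach (the two-channel count and equal weights here are assumptions; substitute your device's values):

```matlab
% Read a multichannel device and mix down to mono before classifying.
adr = audioDeviceReader(SampleRate=16e3,SamplesPerFrame=512, ...
    NumChannels=2);            % assumed two-channel device
M = 0;                         % normalization stats from your training
S = 1;
while 1
    frame = adr();             % frame is 512-by-2, one column per mic
    monoFrame = mean(frame,2); % equal-weight mixdown to one channel
    scores = streamingClassifier(monoFrame,M,S);
end
```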
Comments: 23
Arslan Munim
9 Aug 2022
Edited: Arslan Munim on 9 Aug 2022
Hello again,
Can I use the same trained model for data coming from multiple microphones? For example, in parallel: is it possible to give the data from multiple microphones to the model one by one and predict a class for each individual microphone?
jibrahim
9 Aug 2022
Yes, I think this is possible. For example, here is how you do two predictions on two independent sets of features:
[airCompNet,scores] = predictAndUpdateState(airCompNet,{randn(10,1),randn(10,1)})
So you can extract features from each channel and then get scores for each channel.
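Putting it together, a sketch of per-channel scoring (frame1 and frame2 are assumed to be mono frames from two mics; extractFeatures2, M, and S are as defined earlier in this thread):

```matlab
% Extract and normalize features for each channel, then batch both
% sequences into one call; each cell is one independent observation.
features1 = ((extractFeatures2(frame1) - M)./S).';
features2 = ((extractFeatures2(frame2) - M)./S).';
[net,scores] = predictAndUpdateState(net,{features1,features2});
% scores contains one row of class scores per channel
```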
Arslan Munim
9 Aug 2022
Hello again, thank you for your reply. In your reply:
[airCompNet,scores] = predictAndUpdateState(airCompNet,{randn(10,1),randn(10,1)})
here, are {randn(10,1),randn(10,1)} features extracted from 2 different channels?
Arslan Munim
17 Aug 2022
Edited: Walter Roberson on 19 Aug 2022
Hi jibrahim,
I am trying to read data from multiple mics, but it gives me this error every time. I read a frame from each microphone and send that data to the streaming classifier to predict the output, but it always fails on frame1 = adr1():
Error using audioDeviceReader/setup
A given audio device may only be opened once.
Error in audioDeviceReader/setupImpl
Error in multipleMic (line 10)
frame1 = adr1()
adr1 = audioDeviceReader(SampleRate=44.1e3,SamplesPerFrame=22000, ...
    Device="Microphone (4- USB PnP Sound Device)",BitDepth="16-bit integer");
adr2 = audioDeviceReader(SampleRate=44.1e3,SamplesPerFrame=22000, ...
    Device="Microphone (USB PnP Sound Device)",BitDepth="16-bit integer");
% These statistic values should come from your training...
M = 0;
S = 1;
while 1
    % Read a frame of data from microphone
    frame1 = adr1()
    frame2 = adr2()
    % Pass to network
    [class] = streamingClassifier2(frame1,frame2,M,S)
    % Use the scores any way you want
end
function [class] = streamingClassifier2(frame1,frame2,M,S)
% This is a streaming classifier function
persistent net;
if isempty(net)
    net = coder.loadDeepLearningNetwork('net.mat');
end
% Extract features using function
%features = extract(aFE,audioIn)
features1 = extractFeatures2(frame1);
features2 = extractFeatures2(frame2);
% Normalize
features1 = ((features1 - M)./S).';
features2 = ((features2 - M)./S).';
% Classify
[class] = classify(net,{features1,features2});
%[net,scores] = classify(net,feature)
end
jibrahim
17 Aug 2022
You should not create two audio device readers. It seems like you are reading from the same (multichannel) device. Create one audioDeviceReader and call it; it will return the output of each mic as a separate channel. Use the ChannelMappingSource and ChannelMapping properties to control which output channel corresponds to which microphone.
Arslan Munim
17 Aug 2022
No, I am reading from two different devices (microphones). In that case, can you tell me whether I should create separate audio device readers or use the channel mapping source?
Please guide me, and if you could share an example that would also be great.
Thanks a lot for your guidance.
Regards,
Arslan
jibrahim
17 Aug 2022
Hi Arslan,
You can only create one audioDeviceReader at a time; you can't use multiple ones. That is why we support devices that return multiple channels. I suggest you create one object with either device name, call it, and see what you get back (how many channels?).
Arslan Munim
17 Aug 2022
I tried that with only one device, and it returns one frame from that particular microphone. Can I make an audioDeviceReader for each microphone, read frames from each microphone, and then classify the frame read from each mic?
Jimmy Lapierre
17 Aug 2022
Hi Arslan, just to clarify, do you have one USB sound card with several mics hooked up to it, or several USB microphones?
Arslan Munim
17 Aug 2022
Edited: Arslan Munim on 17 Aug 2022
Hi Jimmy, I have multiple USB microphones connected to my laptop.
jibrahim
19 Aug 2022
Arslan, we support the scenario with one USB card with several mics hooked to it. You can't use audioDeviceReader to read from separate cards at the same time. Even if we did, since these different mics run on different clocks, I am not sure how you would achieve synchronization between them anyway.
One possible workaround is to use a different MATLAB session to read from the other microphone and send the data to the main MATLAB session via UDP. So, in another MATLAB session, run code like this:
sender = dsp.UDPSender(RemoteIPPort=25000);
src = audioDeviceReader;
while(1)
    frame = src();
    sender(frame);
end
Then, in the main MATLAB session, you can receive the audio:
rec = dsp.UDPReceiver(LocalIPPort=25000);
scope = timescope;
while(1)
    frame = rec();
    scope(frame);
end
This might work if your sound is in steady state and does not change often or fast. If synchronization between the mics becomes an issue, then one card with multiple mics attached to it is definitely the way to go.
Arslan Munim
19 Aug 2022
Hi jibrahim, thanks for your reply.
Could it work this way, for example: if I use one USB hub and connect my microphones to my laptop through it (so the USB hub has multiple microphones and is connected to the laptop), would that sync all the mics, since all of the microphones would be connected to one USB card?
jibrahim
19 Aug 2022
There should be one driver that aggregates the mics, so this will probably not work.
What device(s) are you working with?
Arslan Munim
19 Aug 2022
I am using a laptop to run MATLAB, where I run the live script to predict using the model.
I am using a couple of these microphones connected to my laptop via USB (Amazon link below, just for your information).
Yes, there is one driver aggregating all the microphones on the laptop.
https://www.amazon.de/Seacue-Omnidirektionaler-Kondensator-Interviews-Netzwerksingen/dp/B071171DBP/ref=asc_df_B071171DBP/?tag=googshopde-21&linkCode=df0&hvadid=310664364876&hvpos=&hvnetw=g&hvrand=8563425015364220221&hvpone=&hvptwo=&hvqmt=&hvdev=c&hvdvcmdl=&hvlocint=&hvlocphy=9068212&hvtargid=pla-378893081924&psc=1&th=1&psc=1&tag=&ref=&adgrpid=62550347900&hvpone=&hvptwo=&hvadid=310664364876&hvpos=&hvnetw=g&hvrand=8563425015364220221&hvqmt=&hvdev=c&hvdvcmdl=&hvlocint=&hvlocphy=9068212&hvtargid=pla-378893081924
jibrahim
19 Aug 2022
If that is the case, then you should be able to open the aggregating device only once (with one audioDeviceReader) as a multichannel device. Find out the name of the aggregating device and choose that one.
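To see which device names audioDeviceReader can open on your system, you can enumerate them:

```matlab
% List the input devices visible to audioDeviceReader; if the driver
% exposes an aggregating device, it should appear under its own name here.
adr = audioDeviceReader;
devices = getAudioDevices(adr)
```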
Arslan Munim
19 Aug 2022
I am still getting only one column per frame (meaning output from only one device). Can you suggest a device that can aggregate all the microphones connected to it? Also, I still see the individual microphones in the device list.
jibrahim
19 Aug 2022
Perhaps this helps:
devinfo = audiodevinfo
See if there is another recognized device name you can use. My guess is that one name corresponds to the single USB card.
Arslan Munim
19 Aug 2022
I checked the inputs; there are individual inputs coming from each microphone, each with only one channel per frame.
devinfo.input(2)
ans =
struct with fields:
Name: 'Microphone (7- USB PnP Sound Device):1 (Windows DirectSound)'
DriverVersion: 'Windows DirectSound'
ID: 1
devinfo.input(4)
ans =
struct with fields:
Name: 'Microphone (7- USB PnP Sound Device):2 (Windows DirectSound)'
DriverVersion: 'Windows DirectSound'
ID: 3
jibrahim
19 Aug 2022
I am assuming you checked all the inputs (IDs 1, 2, etc.). If each one corresponds to just one mic, then MATLAB does not recognize an aggregating device. Check whether your system (outside MATLAB) recognizes an aggregating device or not.
Arslan Munim
19 Aug 2022
No, unfortunately not. It shows two different microphones outside MATLAB as well.
jibrahim
20 Aug 2022
OK, this helps. You will need other hardware (one device, multiple mics) for the system to recognize it. You could also give the UDP idea a shot, see how viable that is.
Arslan Munim
28 Sep 2022
Hi again,
I am trying to train my network after lowering BitsPerSample from 16 to 8. Every time I start training, it throws the warning below and terminates.
I tried different sample rates but get the same result every time. I also tried changing my layer structure and the InitialLearnRate (0.001), but I still get the same warning.
Warning: Training stopped at iteration 1 because training loss is NaN. Predictions using the output network might contain NaN values.
Model:
layers = [ ...
    sequenceInputLayer(size(trainingFeatures{1},1))
    lstmLayer(100,"OutputMode","sequence")
    dropoutLayer(0.1)
    lstmLayer(100,"OutputMode","last")
    fullyConnectedLayer(5)
    softmaxLayer
    classificationLayer];
miniBatchSize = 30;
validationFrequency = floor(numel(trainingFeatures)/miniBatchSize);
options = trainingOptions("adam", ...
    "MaxEpochs",100, ...
    "MiniBatchSize",miniBatchSize, ...
    "Plots","training-progress", ...
    "Verbose",false, ...
    "Shuffle","every-epoch", ...
    "LearnRateSchedule","piecewise", ...
    "LearnRateDropFactor",0.1, ...
    "LearnRateDropPeriod",20, ...
    "InitialLearnRate",0.001, ...
    "ValidationData",{validationFeatures,adsValidation.Labels}, ...
    "ValidationFrequency",validationFrequency);
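In case it helps, a diagnostic sketch (the GradientThreshold value and lower learning rate are guesses, not a confirmed fix): 8-bit audio has a much coarser amplitude grid, so silent or constant frames are more likely to produce NaN/Inf features, and a large gradient step can also drive the loss to NaN:

```matlab
% Check the extracted feature sequences for NaN/Inf before training;
% remember to drop the matching labels for any sequences you remove.
bad = cellfun(@(f) any(~isfinite(f(:))), trainingFeatures);
fprintf('%d of %d sequences contain NaN/Inf features\n', nnz(bad), numel(bad));
% Gradient clipping and a lower learning rate can also keep the loss finite:
options = trainingOptions("adam", ...
    "GradientThreshold",1, ...   % clip gradients (value is a guess)
    "InitialLearnRate",1e-4, ... % lower than the original 0.001
    "MiniBatchSize",30);
```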
Regards,
Arslan