Saving images quickly for huge datasets

Joenam Coutinho on 14 Apr 2022
Commented: Joenam Coutinho on 15 Apr 2022
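ads = audioDatastore(fulfolder, ...
    'IncludeSubfolders',true);  % any further name-value arguments are not shown in the post
ads.Files = natsortfiles(ads.Files);
fs = 44100; % sampling rate (Hz) for the mel spectrogram
for i = 1:length(myFolder)      % myFolder is not defined in this excerpt
    [filepath,filename,extension] = fileparts(ads.Files{i});
    readingdata = read(ads);
    % Pre-process audio data: mix multichannel audio down to mono
    if width(readingdata)>1
        readingdata = mean(readingdata,2);
    end
    % Repeat clips that are shorter than one second
    if length(readingdata)<fs
        readingdata = [readingdata;readingdata];
    end
    % Save spectrogram as image
    % (the call to spectro and its output path are not shown in the post)
end

function spectro(audiodata, fs, path)
    melSpectrogram(audiodata,fs);  % reconstructed: plots when called with no outputs
    colorbar('off');
    axis off;
    file = [path,'.jpg'];
    saveas(gcf,file);              % reconstructed: the post's saveas call is not shown
    % Crop spectrogram data only
    img = imread(file);
    crop_im = imcrop(img,[115 50 675 535]);
    imwrite(crop_im,file);         % reconstructed: write the cropped image back
end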
I have written this code, which saves the mel spectrogram image of each audio sample into a specified folder and later crops it.
My problem is that I have 5136 audio samples, and saving each image takes a very long time.
I would like to know if there is a quicker way to get these images saved to my folder. My device has been running for almost two days and it is still saving the 1100th image.
Just as I offloaded a training process to my GPU, is there a way to offload this work to the GPU?

Answers (2)

Joss Knight on 14 Apr 2022
It's hard to say what will speed things up, since we don't know which part of the process is slow. Is saving slow? Is computing the spectrogram slow? Try running the MATLAB profiler on a subset of the data to see where the bottlenecks are.
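As a rough sketch of the profiling step, the script below profiles a handful of clips and prints the functions with the largest total time. It assumes synthetic noise in place of the real audio files, a temporary output folder, and Audio Toolbox for melSpectrogram; none of these names come from the original post.

```matlab
% Profile a small batch to see which step dominates the run time.
% Assumption: synthetic noise stands in for the real audio files.
fs = 44100;
nFiles = 5;                               % a small subset, not all 5136 files
outDir = tempname;  mkdir(outDir);        % hypothetical output folder

profile on
for k = 1:nFiles
    x = randn(fs, 1);                     % one second of noise
    S = melSpectrogram(x, fs);            % spectrogram computation
    img = rescale(10*log10(S + eps));     % scale to [0,1] for imwrite
    imwrite(img, fullfile(outDir, sprintf('sample%d.jpg', k)));  % file I/O
end
profile off
stats = profile('info');                  % same data that "profile viewer" displays

% Print the five functions with the largest total time.
[~, order] = sort([stats.FunctionTable.TotalTime], 'descend');
top = stats.FunctionTable(order(1:min(5, numel(order))));
for k = 1:numel(top)
    fprintf('%-40s %.3f s\n', top(k).FunctionName, top(k).TotalTime);
end
```

Running `profile viewer` after `profile off` opens the same results in the interactive report, which is usually easier to read than the printed table.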
If it's file I/O that's slow, you can try parallelizing with a parallel construct such as parfor. You might also try using the datastore writeall function, for which you can define a WriteFcn, which would essentially be the code of your spectro function. writeall lets you set the UseParallel option to true.
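A minimal sketch of the writeall approach follows. It assumes three synthetic WAV files in a temporary folder instead of the real dataset, writes the spectrogram matrix directly with imwrite rather than saving a figure, and uses the 'flatten' folder layout so every image lands in one folder (with IncludeSubfolders and repeated file names this could cause collisions). canUseParallelPool is used only to fall back to serial writing when no parallel pool is available.

```matlab
% Assumption: three synthetic WAV files stand in for the real dataset.
fs = 44100;
srcDir = tempname;  mkdir(srcDir);        % hypothetical input folder
outDir = tempname;  mkdir(outDir);        % hypothetical output folder
for k = 1:3
    audiowrite(fullfile(srcDir, sprintf('tone%d.wav', k)), 0.1*randn(fs, 1), fs);
end
ads = audioDatastore(srcDir);

% Mel spectrogram in dB, scaled to [0,1], low frequencies at the bottom.
toImage = @(x, fs) flipud(rescale(10*log10(melSpectrogram(mean(x, 2), fs) + eps)));

% The WriteFcn receives the audio data, a write-info struct, and the output format.
% The suggested output name keeps the audio extension, so swap it for .jpg.
writeFcn = @(data, info, ~) imwrite( ...
    toImage(data, info.ReadInfo.SampleRate), ...
    regexprep(info.SuggestedOutputName, '\.\w+$', '.jpg'));

writeall(ads, outDir, ...
    'WriteFcn', writeFcn, ...
    'FolderLayout', 'flatten', ...
    'UseParallel', canUseParallelPool);   % parallel only if a pool can be used
```

Writing the matrix with imwrite also removes the save, read, crop, and re-save cycle, so the parallel speed-up is applied to a much cheaper per-file step.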
If it's the spectrogram computation that's slow, and you have a GPU, maybe running on the GPU will help. Just move your data to the GPU, for instance, melSpectrogram(gpuArray(audiodata),fs).
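The GPU suggestion could look like the sketch below. It assumes a synthetic one-second clip and falls back to the CPU when no supported GPU is found; the gpuArray path needs Parallel Computing Toolbox.

```matlab
fs = 44100;
audiodata = randn(fs, 1);                 % assumption: synthetic stand-in for one clip

if canUseGPU
    S = melSpectrogram(gpuArray(audiodata), fs);  % computed on the GPU
    S = gather(S);                        % copy the result back to host memory
else
    S = melSpectrogram(audiodata, fs);    % CPU fallback
end
```

For short clips the transfer to and from the GPU may cost more than the computation saves, so it is worth timing both paths on real data before committing to one.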
  Comments: 1
Joss Knight on 14 Apr 2022
Oh, I've noticed that you're saving a figure to disk, then loading it again in order to crop it using imcrop. This is highly inefficient. Do not use saveas, use print, and work with the options to axis, axes and print to get the output you're after.
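One way this might look is sketched below, assuming a synthetic clip and a temporary output file. Instead of the built-in melSpectrogram plot, it draws the returned matrix with imagesc in axes that fill the whole figure, so print writes an image that needs no cropping. The resolution value is an arbitrary choice.

```matlab
fs = 44100;
audiodata = randn(fs, 1);                 % assumption: synthetic clip
outFile = [tempname '.jpg'];              % hypothetical output file

fig = figure('Visible', 'off');           % no on-screen rendering
ax = axes(fig, 'Position', [0 0 1 1]);    % axes fill the figure, no margins
S = melSpectrogram(audiodata, fs);        % with an output argument, nothing is plotted
imagesc(ax, 10*log10(S + eps));           % draw the spectrogram in dB
axis(ax, 'xy');                           % low frequencies at the bottom
axis(ax, 'off');                          % hide ticks and labels
print(fig, outFile, '-djpeg', '-r100');   % write once; nothing to crop afterwards
close(fig);
```

Reusing one invisible figure across all files, rather than creating and closing one per file, would likely reduce the overhead further.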


jibrahim on 14 Apr 2022
Hi Joenam,
A couple of things I noticed in your code:
1) You rely on melSpectrogram to generate a plot for you, which is fine, but that will be a bottleneck, as you generate a plot for every file. Returning the spectrogram instead (S = melSpectrogram... will not generate a plot) and saving S to a file is probably faster.
2) For each audio file, you write an image file, but then you read it, and then write it again. You would save time by pre-processing S, and writing the image file once, with no need to read it again.
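Both points can be sketched together as follows, assuming a synthetic clip and a temporary output file. The decibel conversion and the [0,1] scaling are illustrative choices for turning S into a grayscale image, not something specified in the thread.

```matlab
fs = 44100;
audiodata = randn(fs, 1);                 % assumption: synthetic clip
outFile = [tempname '.jpg'];              % hypothetical output file

S = melSpectrogram(audiodata, fs);        % bands-by-frames matrix, no plot is drawn
SdB = 10*log10(S + eps);                  % power to decibels
img = flipud(rescale(SdB));               % [0,1], low frequencies at the bottom
imwrite(img, outFile);                    % single write, grayscale JPEG
```

Because img is an ordinary matrix, any cropping can be done by indexing rows and columns before the imwrite call, so the file never has to be read back.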
  Comments: 1
Joenam Coutinho on 15 Apr 2022
I am not quite clear about point no. 2.
I tried cropping the mel spectrogram before saving it, in order to avoid the extra read and write, but I am unable to feed S into imcrop. It gives me an error: 'Expected DATA to be nonempty.'
I feel I am doing something wrong but do not know where I am going wrong.


