Also, I have no problem starting the network training on my local CPU with the library saved in a local folder. It really seems the problem comes from the data being in the cloud.
trainNetwork error unable to read file
Hi all,
I am learning to train a convolutional network for image classification in the cloud. As a first step, I am following the MathWorks example named "Train Network in the Cloud Using Automatic Parallel Support".
I have started my cluster successfully and uploaded the CIFAR-10 image library to my Amazon S3 bucket.
I then successfully create the datastore using:
imdsTrain = imageDatastore('s3://mybucket/cifar10/train', ...
'IncludeSubfolders',true, ...
'LabelSource','foldernames');
My problem comes at the training level, where I use:
options = trainingOptions('sgdm', ...
'ExecutionEnvironment','parallel', ... % Turn on automatic parallel support.
'InitialLearnRate',initialLearnRate, ... % Set the initial learning rate.
'MiniBatchSize',miniBatchSize, ... % Set the MiniBatchSize.
'Verbose',true, ... % Print training progress to the command window.
'Plots','training-progress', ... % Turn on the training progress plot.
'L2Regularization',1e-10, ...
'MaxEpochs',50, ...
'Shuffle','every-epoch', ...
'ValidationData',imdsTest, ...
'ValidationFrequency',floor(numel(imdsTrain.Files)/miniBatchSize), ...
'LearnRateSchedule','piecewise', ...
'LearnRateDropFactor',0.1, ...
'LearnRateDropPeriod',45);
net = trainNetwork(augmentedImdsTrain,layers,options);
The training starts and displays the message "Initializing input data normalization".
However, it quickly stops with the error message:
Error in test_parallel_cloud (line 77)
net = trainNetwork(augmentedImdsTrain,layers,options);
Caused by:
Error using nnet.internal.cnn.DistributedDispatcher/computeInParallel (line 193)
Error detected on worker 1.
Error using matlab.io.datastore.ImageDatastore/read (line 77)
Unable to read file: 's3://mybucket/cifar10/train/deer/image35398.png'.
Error using matlab.io.datastore/DsFileReader (line 113)
Could not find file : s3://mybucket/cifar10/train/deer/image35398.png
Every time I rerun the code, it seems to stop on a different image that it cannot read. However, the image is always in the bucket and does not seem to be corrupt when I check it using imshow.
Can you see where the problem is?
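One thing that might be worth ruling out is whether the workers actually see the AWS credentials that the client uses. Below is a minimal diagnostic sketch, not a confirmed fix. It assumes the credentials are already set on the client with setenv under the standard AWS variable names, that no pool is open yet, and that the installed release supports the 'EnvironmentVariables' option of parpool. The variable names awsVars, checkVar and isSet are only illustrative.

```matlab
% Sketch only: assumes the AWS credentials are already set on the client
% with setenv, and that no parallel pool is currently open.
awsVars = {'AWS_ACCESS_KEY_ID','AWS_SECRET_ACCESS_KEY','AWS_DEFAULT_REGION'};

% Start a pool on the default cluster profile and copy the client's
% values of these variables to every worker.
pool = parpool('EnvironmentVariables', awsVars);

% Returns true when the named variable is non-empty where it is evaluated.
checkVar = @(name) ~isempty(getenv(name));

% Ask every worker whether it can see each variable.
for k = 1:numel(awsVars)
    f = parfevalOnAll(pool, checkVar, 1, awsVars{k});
    isSet = fetchOutputs(f);   % one logical value per worker
    fprintf('%s is set on %d of %d workers\n', ...
        awsVars{k}, nnz(isSet), pool.NumWorkers);
end
```

If a variable is missing on some workers, that would be consistent with reads failing on the workers while imshow works on the client, although it would not by itself explain why a different image fails on each run.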
Comments: 7
Fouzia Adjailia
May 1, 2020
Hello,
I'm having a similar problem to yours and I would highly appreciate it if you could help me.
I created an image datastore with a customized read function called @formoccupancygrid. When I run my code in parallel, I get this error:
Error using classifyData (line 33)
Error detected on worker 1.
Caused by:
Error using matlab.io.datastore.ImageDatastore/readall (line 42)
Error using ReadFcn @UNKNOWN Function for file
D:\--*******************************
Undefined function handle.
I solved this problem using parfevalOnAll, which executes the function on all the workers. After that, I get another error, which states that the files don't exist. I added the files to the attached files and the path to the additional paths in the Cluster Profile Manager, but with no luck.
Looking forward to your reply.
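For the undefined function handle, one way to check whether the read function reaches the workers is sketched below. It is only an illustration: it assumes formoccupancygrid.m is on the client path and that a pool can be opened with the current default profile, and the variable names pool and status are hypothetical.

```matlab
% Sketch only: assumes formoccupancygrid.m is on the client path.
pool = gcp;   % get the current pool, or start one with the default profile

% Send the function file to all workers, using its full path on the client.
addAttachedFiles(pool, {which('formoccupancygrid')});

% exist returns 2 on a worker when the function file is visible on its path.
f = parfevalOnAll(pool, @exist, 1, 'formoccupancygrid');
status = fetchOutputs(f);   % one value per worker
disp(status.')
```

Attaching the function does not make the image files themselves reachable. The D:\ paths stored in the datastore still have to resolve on each worker, which may be what the "files don't exist" error is about.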
Daniel Csata
October 29, 2022
Hi!
I just ran into this exact same problem. Could you please tell me exactly how you solved it with the parpool function? It seems like that didn't work for me, or I did something wrong.
Thank you,
Daniel
Answers (1)
Harsha Priya Daggubati
April 7, 2020
Hi,
Did you follow all the steps mentioned in the following documentation page: