Deep learning with partitionable datastores on a cluster
조회 수: 3 (최근 30일)
이전 댓글 표시
Hello,
I have a data store which contains 1000 .mat files. Each file contains a X*4 table which has the following format (see attached). 'X' is typically 700-900. The tf_ridge column is my data, for this study "sleep stage" is my lable of intrest.
MATLAB deep learning expects a n*2 table input; therefore I created a custom read function to read in the data and strip out the extra two colums and make my lable data categorical as shown below in mys custon read function;
% Calling ds as shown
ds = fileDatastore('C:\mydata',"ReadFcn",@custom_load_FN,"FileExtensions",".mat");
I also make a subset of the data for training and test purposes;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Create a subset of the datastore for test train val purposes
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% [Train, val, test] as a whole percentage i.e. [60,20,20]
split = [90,0,10];
[split_idx] = round(length(ds_org.Files)*(split/100));
train_idx = [1:split_idx(1)];
val_idx = [split_idx(1):split_idx(1)+split_idx(2)];
test_idx = [split_idx(1)+split_idx(2):split_idx(1)+split_idx(2)+split_idx(3)];
% Generate subset for train/test split; will inherit ds properties of
% isPartitionable
dstrain = subset(ds,train_idx);
dsval = subset(ds,val_idx);
dstest = subset(ds,test_idx);
% Custom read function to strip out arousal and epoch columns, and make
% lable categorical
function [a] = custom_load_FN(l)
disp('In load function')
load(l);
%disp(l)
data= removevars(data,{'Arousal','Epoch'});
valSet = {'N1' 'N2' 'N3' 'W' 'R'};
data.Sleep_Stage = categorical(data.Sleep_Stage,valSet);
a = data;
end
When I test this with;
tf = isPartitionable(ds)
MATLAB returns a logical 1; so the datastore is partitionable. However on the cluster I get the following error that the datastore is not partitionable.
The input datastore is not Partitionable and does not support parallel operations.
As a work around; I have also tried to use the @load handle and a transform datastore function which is just a rehash of my custom_load_FN however this has been unsuccessful. I am aware of this post, and this one. However it seems like there should be an easier soloution in my case. I just don't have enough experiance of working with datastores to know what this is.
If anyone has advice on how to make this type of datastore into a partitionable datastore with the ExecutionEnvironment="parallel" option for deep learning I would apprshate the advice!
options = trainingOptions("adam", ...
ExecutionEnvironment="parallel",
...
)
Kind regards,
Christopher
댓글 수: 3
Joss Knight
2023년 3월 16일
Ah yes, this is just an incorrect error message that was fixed in R2022a. I will answer now.
채택된 답변
Joss Knight
2023년 3월 16일
This error message is incorrect. It should say that your datastore is not PartionableByIndex. This was fixed in R2022a.
As long as your datastore is Subsettable you can now (since R2022b) work around this issue by using this Adapter I knocked together. No promises but it's mostly worked so far.
댓글 수: 0
추가 답변 (0개)
참고 항목
카테고리
Help Center 및 File Exchange에서 Pattern Recognition and Classification에 대해 자세히 알아보기
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!