How to split a dataset into 3 sets with splitEachLabel using percentages so that each class appears in all 3 sets?
I have an image dataset with around 100 classes. The largest class has 59 images and the smallest has 5. I tried to split the data into training, validation, and test sets with the following statement:
[imdsTrain, imdsValidation, imdsTest] = splitEachLabel(imds, 0.75, 0.15, 'randomized');
I got an error saying that the training and validation data must have the same labels.
I checked the datastores and found that for classes with few images, such as 5, splitEachLabel puts 4 images in training and the remaining 1 in either the validation set or the test set. As a result, some classes that are in the training set are missing from the validation or test set.
I worked around it by increasing the validation percentage from 0.15 to 0.2, but that doesn't seem like a good solution.
Is there a way to split the dataset so that all classes are present in all 3 sets? Preferably I want to do it with percentages. I don't want to pass integers, because that would always put exactly 1 image per class in the validation and test sets.
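The check described above (finding which training classes are missing from the validation or test set) can be sketched as follows. The three label vectors are made-up stand-ins so the snippet runs on its own; with real data they would be imdsTrain.Labels, imdsValidation.Labels and imdsTest.Labels.

```matlab
% Stand-in label vectors (hypothetical data, for illustration only).
% With real datastores use imdsTrain.Labels, imdsValidation.Labels, imdsTest.Labels.
trainLabels = categorical(["cat"; "cat"; "dog"; "dog"; "owl"]);
valLabels   = categorical(["cat"; "owl"]);
testLabels  = categorical(["cat"; "dog"]);

% Classes that occur in training but not in the other two sets.
missingInVal  = setdiff(unique(trainLabels), unique(valLabels));
missingInTest = setdiff(unique(trainLabels), unique(testLabels));

disp(missingInVal);
disp(missingInTest);
```

If either result is non-empty, those classes are the ones that trigger the "same labels" error.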
Answers (1)
Anmol Dhiman
3 Jul 2020
Edited: Anmol Dhiman
3 Jul 2020
Hi Faisal,
The second argument (0.75) in splitEachLabel is the proportion or number of files to split, specified as a scalar in the interval (0,1) or a positive integer scalar. You can change its value for your problem.
Regards,
Anmol Dhiman
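One possible way to keep percentages while still guaranteeing at least one image per class in every set is to compute the per-class counts by hand. This is only a sketch, not something splitEachLabel does for you: the variable names are made up, the label vector is a stand-in built from the class sizes in the question (5 and 59), and the rule "round the percentage, but never go below 1" is an assumption about what is wanted.

```matlab
% Stand-in for imds.Labels (hypothetical): one class with 5 images, one with 59.
labels = categorical([repmat("small", 5, 1); repmat("large", 59, 1)]);

pTrain = 0.75;  pVal = 0.15;            % the remaining share goes to test
classes = unique(labels);
idxTrain = [];  idxVal = [];  idxTest = [];

for k = 1:numel(classes)
    idx = find(labels == classes(k));   % indices of the images in this class
    idx = idx(randperm(numel(idx)));    % shuffle within the class
    n = numel(idx);

    % Percentage-based counts, but never fewer than one image per set.
    nVal   = max(1, round(pVal * n));
    nTest  = max(1, round((1 - pTrain - pVal) * n));
    nTrain = n - nVal - nTest;
    assert(nTrain >= 1, 'Class %s has too few images for a 3-way split.', char(classes(k)));

    idxTrain = [idxTrain; idx(1:nTrain)];
    idxVal   = [idxVal;   idx(nTrain+1 : nTrain+nVal)];
    idxTest  = [idxTest;  idx(nTrain+nVal+1 : end)];
end

% With a real datastore, assuming a release where subset supports imageDatastore:
% imdsTrain      = subset(imds, idxTrain);
% imdsValidation = subset(imds, idxVal);
% imdsTest       = subset(imds, idxTest);
```

Large classes end up close to the requested 75/15/10 split, while very small classes give up a training image so that validation and test each get one. A class needs at least 3 images for this to work.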