augmentedImageDatastore for image segmentation

33 views (last 30 days)
Alexander Resare on 8 Mar 2024
Commented: Matt J on 1 Apr 2024 at 22:12
Hello,
I wish to create an augmented image datastore that I can use during training. Previously I augmented all image pairs with my own custom function before training, but my project supervisor suggested augmenting during training instead, so that the network sees many more different images. I understand another approach would be to augment even more images before training and decrease the number of epochs, but I would like to get MATLAB's built-in augmenter working as well. Here is the problem I am facing:
size(X_train) = [224 224 3 200]
size(Y_train) = [224 224 200]
In the example provided in MATLAB's documentation for augmentedImageDatastore, Y_train is just a 1D categorical array. In my case, I need to augment the Y data as well as the X data, with the same augmentation applied to each pair. I tried something like this:
%% Built-in augmenter
imageAugmenter = imageDataAugmenter( ...
'RandRotation',[0 360], ...
'RandXTranslation',[-5 5], ...
'RandYTranslation',[-5 5], ...
'RandXReflection', true, ...
'RandYReflection', true );
training = combine(ds_X_training, ds_Y_training);
aug_training = augmentedImageDatastore([224 224 3], training, 'DataAugmentation', imageAugmenter);
And I get an error. The following works fine, however:
X_aug_training = augmentedImageDatastore([224 224 3], ds_X_aug_training, 'DataAugmentation', imageAugmenter);
I understand the error arises because I can't feed a combined datastore or pixelLabelDatastore into augmentedImageDatastore. I saw some examples on augmenting pixel label images, such as Augment Pixel Labels for Semantic Segmentation, but that article did not mention anything about augmentedImageDatastore, which is the one I am interested in because it won't keep the augmented images in memory while training.

Accepted Answer

Matt J on 8 Mar 2024
Edited: Matt J on 8 Mar 2024
Supply the training data in numeric form:
X_training = rand([224 224 3 200]) ; %Fake
Y_training = rand([224 224 1 200]) ; %Fake
imageAugmenter = imageDataAugmenter( ...
'RandRotation',[0 360], ...
'RandXTranslation',[-5 5], ...
'RandYTranslation',[-5 5], ...
'RandXReflection', true);
aug_training = augmentedImageDatastore([224 224], X_training, Y_training,...
'DataAugmentation', imageAugmenter)
aug_training =
  augmentedImageDatastore with properties:
         NumObservations: 200
           MiniBatchSize: 128
        DataAugmentation: [1x1 imageDataAugmenter]
      ColorPreprocessing: 'none'
              OutputSize: [224 224]
          OutputSizeMode: 'resize'
    DispatchInBackground: 0
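To check what the datastore will hand to trainNetwork, you can read one mini-batch from it. A minimal sketch (the indexing below assumes the default table layout, with the augmented input images in the first variable and the responses in the second):
% Quick sanity check: read one mini-batch from the augmented datastore.
reset(aug_training);
miniBatch = read(aug_training);   % one mini-batch, returned as a table
summary(miniBatch)                % variable names, types, and sizes
imshow(miniBatch{1,1}{1})         % first augmented input image in the batch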
1 Comment
Alexander Resare on 9 Mar 2024
Thank you, I can now create aug_training. When I try to use it for training, however, I get a new error:
options = trainingOptions('adam', ...
'ExecutionEnvironment', 'gpu', ...
'MaxEpochs', 12, ...
'MiniBatchSize', 16, ...
'ValidationData', validation, ...
'InitialLearnRate',0.001, ...
'LearnRateSchedule','piecewise', ...
'LearnRateDropFactor', 0.2, ...
'LearnRateDropPeriod', 3, ...
'ValidationFrequency', 10, ...
'OutputFcn', @(info) opt_func(info));
trained_net = trainNetwork(aug_training, lgraph, options);
I then tried to convert Y_training from arrays of numeric labels to a categorical array, which my supervisor had recommended earlier in order to avoid unexpected values in the ground-truth data resulting from an inappropriate interpolation method. Then I was faced with another error, even before attempting to train:
It doesn't make sense to me that when I pass Y_train as a categorical 224x224x1x200 array, it complains about Y_train not being a vector. I assume augmentedImageDatastore expects a 1D vector containing single labels of each image. Do you know how I can work around this issue for my segmentation task?
In case it is useful, here are some details regarding the final layer of the network:
classNames = ["live" "dead" "background"];
classWeights = [20 20 1];
numClasses = 3;
imageSize = [224 224 3];
labelIDs = [255, 128, 0];
network = 'mobilenetv2';
lgraph = deeplabv3plusLayers(imageSize,numClasses,network);
end_layer = genDiceLossPixelClassificationLayer('end_layer', classWeights, labelIDs, classNames, true);
lgraph = replaceLayer(lgraph, 'classification', end_layer);
So in essence I am contemplating whether each pixel in the Y images should be 255, 128, or 0 versus "live", "dead", or "background" before augmentation.
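For reference, here is a sketch of one way to convert the 255/128/0 label images into a categorical array before any augmentation (this assumes Y_train is a 224x224x200 uint8 array; the reshape just matches the 4-D layout trainNetwork expects):
classNames = ["live" "dead" "background"];
labelIDs   = [255 128 0];
% Map pixel values to class labels; categorical() keeps the array shape.
Y_cat = categorical(Y_train, labelIDs, classNames);   % 224x224x200 categorical
Y_cat = reshape(Y_cat, [224 224 1 200]);              % H x W x 1 x N for training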


More Answers (1)

Birju Patel on 1 Apr 2024
augmentedImageDatastore was not designed to augment data for semantic segmentation. Instead, I recommend combining an imageDatastore and a pixelLabelDatastore and then using transform to implement the data augmentation for semantic segmentation. Here is an example:
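A minimal sketch of that pattern (the datastore paths and the helper function augmentImageAndLabel below are placeholders; the augmentation ranges simply mirror the ones used earlier in this thread):
% Combine image and pixel label datastores, then augment pairs on the fly.
% Folder names and the helper below are illustrative, not from this thread.
classNames = ["live" "dead" "background"];
labelIDs   = [255 128 0];
imds = imageDatastore("images");                              % input images
pxds = pixelLabelDatastore("labels", classNames, labelIDs);   % ground-truth masks
ds = combine(imds, pxds);
dsAug = transform(ds, @augmentImageAndLabel);                 % augmented pairs

function dataOut = augmentImageAndLabel(dataIn)
% Apply the SAME random affine transform to an image and its label mask.
img = dataIn{1};
lbl = dataIn{2};
tform = randomAffine2d( ...
    'Rotation', [0 360], ...
    'XTranslation', [-5 5], ...
    'YTranslation', [-5 5], ...
    'XReflection', true, ...
    'YReflection', true);
rout = affineOutputView(size(img), tform, 'BoundsStyle', 'centerOutput');
img = imwarp(img, tform, 'OutputView', rout);
lbl = imwarp(lbl, tform, 'OutputView', rout);   % categorical labels use nearest-neighbor interpolation
dataOut = {img, lbl};
end
The transformed datastore dsAug can then be passed directly to trainNetwork, so nothing is pre-generated or held in memory.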
1 Comment
Matt J on 1 Apr 2024 at 22:12
Unfortunately, though, this requires the Computer Vision Toolbox.

