pixelLabelImageDatastore

(To be removed) Datastore for semantic segmentation networks

Description

pixelLabelImageDatastore will be removed in a future release. Use the imageDatastore and pixelLabelDatastore objects and the combine function instead.

Creation

Description

pximds = pixelLabelImageDatastore(gTruth) returns a datastore for training a semantic segmentation network based on the input groundTruth object or array of groundTruth objects. Use the output pixelLabelImageDatastore object with the Deep Learning Toolbox™ function trainNetwork (Deep Learning Toolbox) to train convolutional neural networks for semantic segmentation.

pximds = pixelLabelImageDatastore(imds,pxds) returns a datastore based on the input image datastore and the pixel label datastore objects. imds is an ImageDatastore object that represents the training input to the network. pxds is a PixelLabelDatastore object that represents the required network output.

pximds = pixelLabelImageDatastore(___,Name,Value) additionally uses name-value pairs to set the DispatchInBackground and OutputSizeMode properties. For 2-D data, you can also use name-value pairs to specify the ColorPreprocessing, DataAugmentation, and OutputSize augmentation properties. You can specify multiple name-value pairs. Enclose each property name in quotes.

For example, pixelLabelImageDatastore(gTruth,'OutputSize',[32 32]) creates a pixel label image datastore that adjusts the images and pixel labels from gTruth to 32-by-32 pixels.
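
As a minimal sketch of the two-datastore syntax with name-value pairs, the following code builds the datastore from the triangle images that ship with Computer Vision Toolbox (the same data used in the example on this page). It assumes a release in which pixelLabelImageDatastore is still available. The 64-by-64 output size is an arbitrary value chosen for illustration.

```matlab
% Locate the example triangle data set that ships with Computer Vision Toolbox.
dataSetDir = fullfile(toolboxdir('vision'),'visiondata','triangleImages');
imds = imageDatastore(fullfile(dataSetDir,'trainingImages'));
pxds = pixelLabelDatastore(fullfile(dataSetDir,'trainingLabels'), ...
    ["triangle","background"],[255 0]);

% Pair images with labels and resize both to 64-by-64 (arbitrary size).
pximds = pixelLabelImageDatastore(imds,pxds, ...
    'OutputSize',[64 64],'OutputSizeMode','resize');

disp(pximds.OutputSize)
```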

Input Arguments

gTruth: Ground truth data, specified as a groundTruth object or as an array of groundTruth objects. Each groundTruth object contains information about the data source, the list of label definitions, and all marked labels for a set of ground truth labels.

imds: Collection of images, specified as an ImageDatastore object.

pxds: Collection of pixel labeled images, specified as a PixelLabelDatastore object. The object contains the pixel labeled images for each image contained in the imds input object.

Properties

This property is read-only.

Images: Image file names used as the source for ground truth images, specified as a character vector or a cell array of character vectors.

This property is read-only.

PixelLabelData: Pixel label data file names used as the source for ground truth label images, specified as a character vector or a cell array of character vectors.

This property is read-only.

ClassNames: Class names, specified as a cell array of character vectors.

ColorPreprocessing: Color channel preprocessing for 2-D data, specified as 'none', 'gray2rgb', or 'rgb2gray'. Use this property when the image data created by the data source must be all color or all grayscale, but the training set includes both. For example, suppose you need to train a network that expects color images, but some of your training images are grayscale. Set ColorPreprocessing to 'gray2rgb' to replicate the single channel of each grayscale image across three color channels. Using the 'gray2rgb' option creates M-by-N-by-3 output images.

The ColorPreprocessing property is not supported for 3-D data. To perform color channel preprocessing of 3-D data, use the transform function.
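
The transform-based route can be sketched as follows on 2-D data. The same pattern applies to a combined datastore, which has no ColorPreprocessing property. The helper name toRGB is made up for this illustration, and the code assumes every image has either one or three channels.

```matlab
% Hypothetical helper: replicate a single channel three times, leave RGB as is.
% size(I,3) is 1 for grayscale and 3 for RGB, so the repeat count is 3 or 1.
toRGB = @(I) repmat(I,1,1,3/size(I,3));

grayImage = uint8(magic(4));                        % 4-by-4 grayscale stand-in
rgbImage  = cat(3,grayImage,grayImage,grayImage);   % 4-by-4-by-3 stand-in
disp(size(toRGB(grayImage)))
disp(size(toRGB(rgbImage)))

% Apply the helper to the image in each {image,label} pair of a combined datastore.
dataSetDir = fullfile(toolboxdir('vision'),'visiondata','triangleImages');
imds = imageDatastore(fullfile(dataSetDir,'trainingImages'));
pxds = pixelLabelDatastore(fullfile(dataSetDir,'trainingLabels'), ...
    ["triangle","background"],[255 0]);
cds = combine(imds,pxds);
tds = transform(cds,@(data) {toRGB(data{1}),data{2}});

sample = read(tds);   % sample{1} is the image, sample{2} is the categorical label
disp(size(sample{1}))
```

The label in each pair passes through unchanged, because color conversion applies only to the image.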

DataAugmentation: Preprocessing applied to input images, specified as an imageDataAugmenter (Deep Learning Toolbox) object or 'none'. When DataAugmentation is 'none', no preprocessing is applied to input images. Training data can be augmented in real time during training.

The DataAugmentation property is not supported for 3-D data. To preprocess 3-D data, use the transform function.
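
The following sketch shows one way an augmenter might be attached for 2-D data. It assumes that Deep Learning Toolbox is installed and that the release still includes pixelLabelImageDatastore. The reflection and rotation settings are illustrative values, not recommendations.

```matlab
% Example triangle data set that ships with Computer Vision Toolbox.
dataSetDir = fullfile(toolboxdir('vision'),'visiondata','triangleImages');
imds = imageDatastore(fullfile(dataSetDir,'trainingImages'));
pxds = pixelLabelDatastore(fullfile(dataSetDir,'trainingLabels'), ...
    ["triangle","background"],[255 0]);

% Random left-right reflection and rotation of up to 10 degrees (illustrative).
augmenter = imageDataAugmenter('RandXReflection',true,'RandRotation',[-10 10]);

% Attach the augmenter when creating the datastore.
pximds = pixelLabelImageDatastore(imds,pxds,'DataAugmentation',augmenter);
```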

DispatchInBackground: Dispatch observations in the background during training, prediction, and classification, specified as false or true. To use background dispatching, you must have Parallel Computing Toolbox™. If DispatchInBackground is true and you have Parallel Computing Toolbox, then pixelLabelImageDatastore asynchronously reads images and pixel labels, applies any specified augmentation, and queues the resulting image-label pairs.

MiniBatchSize: Number of observations that are returned in each batch. The default value is equal to the ReadSize of the image datastore imds. You can change the value of MiniBatchSize only after you create the datastore. For training, prediction, or classification, the MiniBatchSize property is set to the mini-batch size defined in trainingOptions (Deep Learning Toolbox).

This property is read-only.

NumObservations: Total number of observations in the pixel label image datastore. The number of observations is the length of one training epoch.

This property is read-only.

OutputSize: Size of output images, specified as a vector of two positive integers. The first element specifies the number of rows in the output images, and the second element specifies the number of columns. When you specify OutputSize, image sizes are adjusted as necessary. By default, this property is empty, which means that the images are not adjusted.

The OutputSize property is not supported for 3-D data. To set the output size of 3-D data, use the transform function.

OutputSizeMode: Method used to resize output images, specified as one of the following values. This property applies only when you set OutputSize to a value other than [].

  • 'resize' — Scale the image to fit the output size. For more information, see imresize.

  • 'centercrop' — Take a crop from the center of the training image. The crop has the same size as the output size.

  • 'randcrop' — Take a random crop from the training image. The random crop has the same size as the output size.

Data Types: char | string
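
A combined datastore has no OutputSize or OutputSizeMode property, so comparable behavior has to be supplied through transform. The sketch below approximates the 'resize' and 'centercrop' modes under that assumption. The helper name cropBoth and the 16-by-16 target size are made up for illustration.

```matlab
% Example triangle data set that ships with Computer Vision Toolbox.
dataSetDir = fullfile(toolboxdir('vision'),'visiondata','triangleImages');
imds = imageDatastore(fullfile(dataSetDir,'trainingImages'));
pxds = pixelLabelDatastore(fullfile(dataSetDir,'trainingLabels'), ...
    ["triangle","background"],[255 0]);
cds = combine(imds,pxds);

targetSize = [16 16];   % arbitrary size for illustration

% 'resize' equivalent: scale the image, and use nearest-neighbor
% interpolation for the categorical labels so no new classes appear.
resized = transform(cds,@(data) {imresize(data{1},targetSize), ...
                                 imresize(data{2},targetSize,'nearest')});

% 'centercrop' equivalent: crop image and label with the same centered window.
cropBoth = @(data,win) {imcrop(data{1},win),imcrop(data{2},win)};
cropped = transform(cds,@(data) cropBoth(data, ...
    centerCropWindow2d(size(data{1}),targetSize)));

r = read(resized);
c = read(cropped);
disp(size(r{1}))
disp(size(c{2}))
```

Computing the crop window once per pair and passing it to both imcrop calls keeps the image and its labels aligned.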

Object Functions

combine           Combine data from multiple datastores
countEachLabel    Count occurrence of pixel or box labels
hasdata           Determine if data is available to read
partitionByIndex  Partition pixelLabelImageDatastore according to indices
preview           Preview subset of data in datastore
read              Read data from a datastore
readall           Read all data in datastore
readByIndex       Read data specified by index from pixelLabelImageDatastore
reset             Reset datastore to initial state
shuffle           Return shuffled version of datastore
transform         Transform datastore
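
To illustrate a few of these functions, the sketch below counts labels and then steps through a combined datastore, which is the recommended replacement for pixelLabelImageDatastore. The limit of three reads is arbitrary.

```matlab
% Example triangle data set that ships with Computer Vision Toolbox.
dataSetDir = fullfile(toolboxdir('vision'),'visiondata','triangleImages');
imds = imageDatastore(fullfile(dataSetDir,'trainingImages'));
pxds = pixelLabelDatastore(fullfile(dataSetDir,'trainingLabels'), ...
    ["triangle","background"],[255 0]);
cds = combine(imds,pxds);

% Pixel counts per class, computed from the label datastore.
tbl = countEachLabel(pxds);
disp(tbl)

% Read the first three image/label pairs.
numRead = 0;
while hasdata(cds) && numRead < 3
    data = read(cds);       % 1-by-2 cell: {image, categorical label}
    numRead = numRead + 1;
end

reset(cds)                  % the next read starts again from the first pair
```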

Examples

Load the training data.

dataSetDir = fullfile(toolboxdir('vision'),'visiondata','triangleImages');
imageDir = fullfile(dataSetDir,'trainingImages');
labelDir = fullfile(dataSetDir,'trainingLabels');

Create an image datastore for the images.

imds = imageDatastore(imageDir);

Create a pixelLabelDatastore for the ground truth pixel labels.

classNames = ["triangle","background"];
labelIDs   = [255 0];
pxds = pixelLabelDatastore(labelDir,classNames,labelIDs);

Visualize training images and ground truth pixel labels.

I = read(imds);
C = read(pxds);

I = imresize(I,5);
L = imresize(uint8(C{1}),5);
imshowpair(I,L,'montage')

Create a semantic segmentation network. This example uses a simple network based on a downsampling and upsampling design.

numFilters = 64;
filterSize = 3;
numClasses = 2;
layers = [
    imageInputLayer([32 32 1])                                   % 32-by-32 grayscale input
    convolution2dLayer(filterSize,numFilters,'Padding',1)        % 3-by-3 convolution, size preserved
    reluLayer()
    maxPooling2dLayer(2,'Stride',2)                              % downsample by a factor of 2
    convolution2dLayer(filterSize,numFilters,'Padding',1)
    reluLayer()
    transposedConv2dLayer(4,numFilters,'Stride',2,'Cropping',1)  % upsample by a factor of 2
    convolution2dLayer(1,numClasses)                             % per-pixel class scores
    softmaxLayer()
    pixelClassificationLayer()
    ];

Set up the training options.

opts = trainingOptions('sgdm', ...
    'InitialLearnRate',1e-3, ...
    'MaxEpochs',100, ...
    'MiniBatchSize',64);

Combine the image and pixel label datastore for training.

trainingData = combine(imds,pxds);

Train the network.

net = trainNetwork(trainingData,layers,opts);
Training on single CPU.
Initializing input data normalization.
|========================================================================================|
|  Epoch  |  Iteration  |  Time Elapsed  |  Mini-batch  |  Mini-batch  |  Base Learning  |
|         |             |   (hh:mm:ss)   |   Accuracy   |     Loss     |      Rate       |
|========================================================================================|
|       1 |           1 |       00:00:00 |       58.11% |       1.3458 |          0.0010 |
|      17 |          50 |       00:00:12 |       97.30% |       0.0924 |          0.0010 |
|      34 |         100 |       00:00:24 |       98.09% |       0.0575 |          0.0010 |
|      50 |         150 |       00:00:37 |       98.56% |       0.0424 |          0.0010 |
|      67 |         200 |       00:00:49 |       98.48% |       0.0435 |          0.0010 |
|      84 |         250 |       00:01:02 |       98.66% |       0.0363 |          0.0010 |
|     100 |         300 |       00:01:14 |       98.90% |       0.0310 |          0.0010 |
|========================================================================================|
Training finished: Reached final iteration.

Read and display a test image.

testImage = imread('triangleTest.jpg');
imshow(testImage)

Segment the test image and display the results.

C = semanticseg(testImage,net);
B = labeloverlay(testImage,C);
imshow(B)

Tips

  • The pixelLabelDatastore pxds and the imageDatastore imds store files that are located in a folder in lexicographical order. For example, if you have twelve files named 'file1.jpg', 'file2.jpg', … , 'file11.jpg', and 'file12.jpg', then the files are stored in this order:

    'file1.jpg'
    'file10.jpg'
    'file11.jpg'
    'file12.jpg'
    'file2.jpg'
    'file3.jpg'
    ...
    'file9.jpg'

    Files that are stored in a cell array are read in the same order as they are stored.

    If the order of files in pxds and imds is not the same, then you may encounter a mismatch when you read a ground truth image and the corresponding label data using a pixelLabelImageDatastore. If this occurs, then rename the pixel label files so that they have the correct order. For example, rename 'file1.jpg', … , 'file9.jpg' to 'file01.jpg', …, 'file09.jpg'.

  • To extract semantic segmentation data from a groundTruth object generated by the Video Labeler, use the pixelLabelTrainingData function.
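
The ordering issue described in the first tip can be reproduced with plain cell arrays. This sketch uses made-up file names and paths (not files from any shipped data set) and assumes that matching image and label files share a base name, which may not hold for your data.

```matlab
% Hypothetical names file1.jpg ... file12.jpg, listed in numeric order.
names = arrayfun(@(k) sprintf('file%d.jpg',k),1:12,'UniformOutput',false);
sorted = sort(names);        % character-code order, as in the list above
disp(sorted(1:5))

% Zero-padded names have the same numeric and lexicographical order.
padded = arrayfun(@(k) sprintf('file%02d.jpg',k),1:12,'UniformOutput',false);
disp(isequal(sort(padded),padded))

% Check that two file lists pair up by base name (paths are hypothetical).
imageFiles = {'/data/images/file01.jpg';'/data/images/file02.jpg'};
labelFiles = {'/data/labels/file01.png';'/data/labels/file02.png'};
[~,imageNames] = cellfun(@fileparts,imageFiles,'UniformOutput',false);
[~,labelNames] = cellfun(@fileparts,labelFiles,'UniformOutput',false);
orderMatches = isequal(imageNames,labelNames);
disp(orderMatches)
```

With real datastores, the same comparison could be run on imds.Files and pxds.Files before combining them.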

Version History

Introduced in R2018a

Not recommended starting in R2022b_plus