ROILabelData

Ground truth data for ROI labels

Description

The ROILabelData object stores ground truth data for region of interest (ROI) label definitions for each signal in a groundTruthMultisignal object.

Creation

When you export a groundTruthMultisignal object from a Ground Truth Labeler app session, the ROILabelData property of the exported object stores the ROI labels as an ROILabelData object. To create an ROILabelData object programmatically, use the vision.labeler.labeldata.ROILabelData function (described here).

Description


roiLabelData = vision.labeler.labeldata.ROILabelData(signalNames,labelData) creates an object containing ROI label data for multiple signals. The created object, roiLabelData, contains properties with the signal names listed in signalNames. These properties store the corresponding ROI label data specified by labelData.

Input Arguments


signalNames

Signal names, specified as a string array. Specify the names of all signals present in the groundTruthMultisignal object you are creating. You can get the signal names from an existing groundTruthMultisignal object by accessing the DataSource property of that object. Use this command, replacing gTruth with the name of your groundTruthMultisignal object variable.

gTruth.DataSource.SignalName

In an exported groundTruthMultisignal object, the ROILabelData object contains a label data property for each signal, even if some signals do not have ROI label data.

The properties of the created ROILabelData object have the names specified by signalNames.

Example: ["video_01_city_c2s_fcw_10s" "lidarSequence"]

labelData

ROI label data for each signal, specified as a cell array of timetables. Each timetable in the cell array contains data for the signal in the corresponding position of the signalNames input. The ROILabelData object stores each timetable in a property that has the same name as that signal.

The timetable format for each signal depends on data from the groundTruthMultisignal object that you exported or are creating.

Each timetable contains one column per label definition stored in the LabelDefinitions property of the groundTruthMultisignal object. Label definitions that the signal type does not support are excluded. For example, suppose you define a Line ROI label named 'lane'. The timetable for a lidar point cloud signal does not include a lane column, because these signals do not support Line ROI labels. In the DataSource property of the groundTruthMultisignal object, the SignalType property of each data source lists the valid signal types.

The height of the timetable is defined by the number of timestamps in the signal. In the DataSource property of the groundTruthMultisignal object, the Timestamp property of each data source lists the signal timestamps.
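The column and height rules above can be sketched with base MATLAB timetables. This is only an illustration: the timestamps and the label names car and lane are hypothetical, and in practice the timestamps come from the Timestamp property of each data source.

```matlab
% Hypothetical timestamps for a video signal and a lidar signal.
videoTimes = seconds((0:4)' * 0.05);   % 5 video frames
lidarTimes = seconds((0:2)' * 0.1);    % 3 point clouds

% Video signal: supports both the car and lane label definitions.
car  = cell(numel(videoTimes),1);
lane = cell(numel(videoTimes),1);
videoTable = timetable(videoTimes,car,lane);

% Lidar signal: Line labels are not supported, so there is no lane column.
car = cell(numel(lidarTimes),1);
lidarTable = timetable(lidarTimes,car);

videoSize = size(videoTable);   % rows = timestamps, columns = label definitions
lidarSize = size(lidarTable);
```

Each timetable has one row per timestamp of its own signal, so the two tables do not need to have the same height.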

For each label definition, all ROI labels marked at a given timestamp are combined into a single cell in the table. Consider the ROI label data for a video signal stored in a groundTruthMultisignal object, gTruth. At each timestamp, car contains three labels, truck contains one label, and lane contains two labels.

gTruth.ROILabelData.video_01_city_c2s_fcw_10s
ans =

  5×4 timetable

      Time           car            truck            lane    
    _________    ____________    ____________    ____________
    0 sec        {3×4 double}    {1×4 double}    {2×1 cell  }
    0.05 sec     {3×4 double}    {1×4 double}    {2×1 cell  }
    0.1 sec      {3×4 double}    {1×4 double}    {2×1 cell  }
    0.15 sec     {3×4 double}    {1×4 double}    {2×1 cell  }
    0.2 sec      {3×4 double}    {1×4 double}    {2×1 cell  }
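To read labels back out of a timetable shaped like the one displayed above, you index into the cell for a given label and timestamp. The following sketch rebuilds a placeholder table with the same layout (all coordinate values are zeros or empty, not real label data) and counts the labels at each timestamp.

```matlab
% Placeholder table with the same layout as the displayed timetable.
Time  = seconds((0:4)' * 0.05);
car   = repmat({zeros(3,4)},5,1);   % three rectangles per timestamp
truck = repmat({zeros(1,4)},5,1);   % one rectangle per timestamp
lane  = repmat({cell(2,1)},5,1);    % two polylines per timestamp
roiTable = timetable(Time,car,truck,lane);

% All car labels at the first timestamp, one row per label.
carsAtStart = roiTable.car{1};

% Number of labels at every timestamp.
numCars  = cellfun(@(c) size(c,1),roiTable.car);   % rows of each matrix
numLanes = cellfun(@numel,roiTable.lane);          % cells of each cell array
```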

The storage format for ROI label data depends on the label type.

Label Type    Storage Format for Labels at Each Timestamp
labelType.Rectangle

M-by-4 numeric matrix of the form [x, y, w, h], where:

  • M is the number of labels in the frame.

  • x and y specify the upper-left corner of the rectangle.

  • w specifies the width of the rectangle, which is its length along the x-axis.

  • h specifies the height of the rectangle, which is its length along the y-axis.

labelType.Cuboid

M-by-9 numeric matrix of the form [xctr, yctr, zctr, xlen, ylen, zlen, xrot, yrot, zrot], where:

  • M is the number of labels in the frame.

  • xctr, yctr, and zctr specify the center of the cuboid.

  • xlen, ylen, and zlen specify the length of the cuboid along the x-axis, y-axis, and z-axis, respectively.

  • xrot, yrot, and zrot specify the rotation angles for the cuboid along the x-axis, y-axis, and z-axis, respectively. These angles are clockwise-positive when looking in the forward direction of their corresponding axes.

The figure shows how these values determine the position of a cuboid.

labelType.Line

M-by-1 vector of cell arrays, where M is the number of labels in the frame. Each cell array contains an N-by-2 numeric matrix of the form [x1 y1; x2 y2; ... ; xN yN] for N points in the polyline.

labelType.PixelLabel

Label data for all pixel label definitions is stored in a single PixelLabelData column as a categorical label matrix. The label matrix must be stored on disk as a uint8 image. When creating an ROILabelData object with pixel label data, the image file name must be specified as a character vector in the labelData input. The label matrix must contain 1 or 3 channels. For a 3-channel matrix, the RGB pixel values represent label IDs.

labelType.Custom

Labels are stored exactly as they are specified in the timetable. If you import a groundTruthMultisignal object containing custom label data into the Ground Truth Labeler app, this data is not imported into the app. Use custom data when gathering label data for training and combining it with data labeled in the app.
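As a rough illustration of the Rectangle, Cuboid, and Line storage formats, the sketch below builds the value that one timetable cell would hold for each label type. The variable names and coordinates are made up for the example.

```matlab
% Rectangle: M-by-4 matrix, one [x y w h] row per label (here M = 2).
carBoxes = [304 212 37 33;
            410 220 40 35];

% Cuboid: M-by-9 matrix, one
% [xctr yctr zctr xlen ylen zlen xrot yrot zrot] row per label (here M = 1).
carCuboids = [27.35 18.32 -0.11 4.25 4.75 3.45 0 0 0];

% Line: M-by-1 cell array, each cell an N-by-2 matrix of polyline points.
% The two polylines here have different numbers of points.
lanes = {[ 70 458; 311 261];
         [500 460; 350 262; 340 250]};
```

Because each polyline can have a different number of points, Line labels are kept in a cell array rather than a single numeric matrix.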

If the ROI label data includes sublabels or attributes, then the labels at each timestamp must be specified as structures instead. The table describes the fields of this structure.

Label Structure Field    Description
Position

Positions of the parent labels at the given timestamp

The format of Position depends on the label type. These formats are described in the previous table.

AttributeName1,...,AttributeNameN

Attributes of the parent labels

Each defined attribute has its own field, where the name of the field corresponds to the attribute name. The attribute value is a character vector for a List or String attribute, a numeric scalar for a Numeric attribute, or a logical scalar for a Logical attribute. If the attribute is unspecified, then the attribute value is an empty vector.

SublabelName1,...,SublabelNameN

Sublabels of the parent labels

Each defined sublabel has its own field, where the name of the field corresponds to the sublabel name. The value of each sublabel field is a structure containing the data for all marked sublabels with that name at the given timestamp.

This table describes the format of this sublabel structure.

Sublabel Structure Field    Description
Position

Positions of the sublabels at the given timestamp

The format of Position depends on the label type. These formats are described in the label type table above.

AttributeName1,...,AttributeNameN

Attributes of the sublabels

Each defined attribute has its own field, where the name of the field corresponds to the attribute name. The attribute value is a character vector for a List or String attribute, a numeric scalar for a Numeric attribute, or a logical scalar for a Logical attribute. If you leave an attribute unspecified, then the attribute value is an empty vector.
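The nested structure layout described in these tables can be sketched as follows for a single parent label. The attribute name color, the sublabel name wheel, and the sublabel attribute isFlat are hypothetical names chosen for illustration, and the coordinates are made up.

```matlab
% Sublabel structure: Position plus one field per sublabel attribute.
% 'isFlat' is a hypothetical Logical attribute.
wheel = struct('Position',[310 235 10 10], ...
               'isFlat',false);

% Parent label structure: Position, one field per attribute,
% and one field per sublabel. 'color' is a hypothetical List attribute.
carLabel = struct('Position',[304 212 37 33], ...
                  'color','red', ...
                  'wheel',wheel);

% The structure takes the place of the numeric matrix in the timetable cell.
carData = cell(3,1);          % three hypothetical timestamps
carData{1} = carLabel;
```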

Properties


ROI label data, specified as timetables. The ROILabelData object contains one property per signal, where each property contains a timetable of ROI label data corresponding to that signal.

When exporting an ROILabelData object from a Ground Truth Labeler app session, the property names correspond to the signal names stored in the DataSource property of the exported groundTruthMultisignal object.

When creating an ROILabelData object programmatically, the signalNames and labelData input arguments define the property names and values of the created object.

Suppose you want to create a groundTruthMultisignal object containing a video signal and a lidar point cloud sequence signal. Specify the signals in a string array, signalNames.

signalNames = ["video_01_city_c2s_fcw_10s" "lidarSequence"];

Store the video ROI labels, videoData, and lidar point cloud sequence ROI labels, lidarData, in a cell array of timetables, labelData. Each timetable contains the data for the corresponding signal in signalNames.

labelData = {videoData,lidarData}
  1×2 cell array

    {204×2 timetable}    {34×1 timetable}

The ROILabelData object, roiData, stores this data in the property with the corresponding signal name. You can specify roiData in the ROILabelData property of a groundTruthMultisignal object.

roiData = vision.labeler.labeldata.ROILabelData(signalNames,labelData)
roiData = 

  ROILabelData with properties:

    video_01_city_c2s_fcw_10s: [204×2 timetable]
                lidarSequence: [34×1 timetable]
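Because the property names are the signal names, you can also read a property by name at run time. The sketch below assumes Automated Driving Toolbox is installed and uses small placeholder timetables with made-up timestamps and coordinates; it is an illustration, not output from a real labeling session.

```matlab
signalNames = ["video_01_city_c2s_fcw_10s" "lidarSequence"];

% Placeholder label data: two timestamps per signal, one Car column each.
videoTimes = seconds([0; 0.05]);
Car = {[304 212 37 33]; []};
videoData = timetable(videoTimes,Car);

lidarTimes = seconds([0; 0.1]);
Car = {[27.35 18.32 -0.11 4.25 4.75 3.45 0 0 0]; []};
lidarData = timetable(lidarTimes,Car);

roiData = vision.labeler.labeldata.ROILabelData(signalNames,{videoData,lidarData});

% Look up the timetable for a signal whose name is stored in a variable.
videoLabels = roiData.(signalNames(1));
firstCar = videoLabels.Car{1};
```

Dynamic lookup is convenient when you loop over every name in signalNames instead of typing each property name by hand.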

Examples


Create ground truth data for a video signal and a lidar point cloud sequence signal that capture the same driving scene. Specify the signal sources, label definitions, and ROI and scene label data.

Create the video data source from an MP4 file.

sourceName = '01_city_c2s_fcw_10s.mp4';
sourceParams = [];
vidSource = vision.labeler.loading.VideoSource;
vidSource.loadSource(sourceName,sourceParams);

Create the point cloud sequence source from a folder of point cloud data (PCD) files.

pcSeqFolder = fullfile(toolboxdir('driving'),'drivingdata','lidarSequence');

% Load the timestamps variable stored alongside the PCD files.
lidarSourceData = load(fullfile(pcSeqFolder,'timestamps.mat'));

sourceName = pcSeqFolder;
sourceParams = struct;
sourceParams.Timestamps = lidarSourceData.timestamps;

pcseqSource = vision.labeler.loading.PointCloudSequenceSource;
pcseqSource.loadSource(sourceName,sourceParams);

Combine the signal sources into an array.

dataSource = [vidSource pcseqSource]
dataSource = 

  1×2 heterogeneous MultiSignalSource (VideoSource, PointCloudSequenceSource) array with properties:

    SourceName
    SourceParams
    SignalName
    SignalType
    Timestamp
    NumSignals

Create a table of label definitions for the ground truth data by using a labelDefinitionCreatorMultisignal object.

  • The Car label definition appears twice. Even though Car is defined as a rectangle, you can draw rectangles only for image signals, such as videos. The labelDefinitionCreatorMultisignal object creates an additional row for lidar point cloud signals. In these signal types, you can draw Car labels as cuboids only.

  • The label definitions have no descriptions and no assigned colors, so the Description and LabelColor columns are empty.

  • The label definitions have no assigned groups, so for all label definitions, the corresponding cell in the Group column is set to 'None'.

  • Road is a pixel label definition, so the table includes a PixelLabelID column.

  • No label definitions have sublabels or attributes, so the table does not include a Hierarchy column for storing such information.

ldc = labelDefinitionCreatorMultisignal;
addLabel(ldc,'Car','Rectangle');
addLabel(ldc,'Lane','Line');
addLabel(ldc,'Road','PixelLabel');
addLabel(ldc,'Sunny','Scene');
labelDefs = create(ldc)
labelDefs =

  5×7 table

      Name       SignalType    LabelType      Group      Description    LabelColor    PixelLabelID
    _________    __________    __________    ________    ___________    __________    ____________

    {'Car'  }    Image         Rectangle     {'None'}       {' '}       {0×0 char}    {0×0 double}
    {'Car'  }    PointCloud    Cuboid        {'None'}       {' '}       {0×0 char}    {0×0 double}
    {'Lane' }    Image         Line          {'None'}       {' '}       {0×0 char}    {0×0 double}
    {'Road' }    Image         PixelLabel    {'None'}       {' '}       {0×0 char}    {[       1]}
    {'Sunny'}    Time          Scene         {'None'}       {' '}       {0×0 char}    {0×0 double}

Create ROI label data for the first frame of the video.

numVideoFrames = numel(vidSource.Timestamp{1});
carData = cell(numVideoFrames,1);
laneData = cell(numVideoFrames,1);
carData{1} = [304 212 37 33];
laneData{1} = [70 458; 311 261];
videoData = timetable(vidSource.Timestamp{1},carData,laneData, ...
                      'VariableNames',{'Car','Lane'});

Create ROI label data for the first point cloud in the sequence.

numPCFrames = numel(pcseqSource.Timestamp{1});
carData = cell(numPCFrames, 1);
carData{1} = [27.35 18.32 -0.11 4.25 4.75 3.45 0 0 0];
lidarData = timetable(pcseqSource.Timestamp{1},carData,'VariableNames',{'Car'});

Combine the ROI label data for both sources.

signalNames = [dataSource.SignalName];
roiData = vision.labeler.labeldata.ROILabelData(signalNames,{videoData,lidarData})
roiData = 

  ROILabelData with properties:

    video_01_city_c2s_fcw_10s: [204×2 timetable]
                lidarSequence: [34×1 timetable]

Create scene label data for the first 10 seconds of the driving scene.

sunnyData = seconds([0 10]);
labelNames = ["Sunny"];
sceneData = vision.labeler.labeldata.SceneLabelData(labelNames,{sunnyData})
sceneData = 

  SceneLabelData with properties:

    Sunny: [0 sec    10 sec]

Create a ground truth object from the signal sources, label definitions, and ROI and scene label data. You can import this object into the Ground Truth Labeler app for manual labeling or to run a labeling automation algorithm on it. You can also extract training data from this object for deep learning models by using the gatherLabelData function.

gTruth = groundTruthMultisignal(dataSource,labelDefs,roiData,sceneData)
gTruth = 

  groundTruthMultisignal with properties:

          DataSource: [1×2 vision.labeler.loading.MultiSignalSource]
    LabelDefinitions: [5×7 table]
        ROILabelData: [1×1 vision.labeler.labeldata.ROILabelData]
      SceneLabelData: [1×1 vision.labeler.labeldata.SceneLabelData]

Introduced in R2020a