Main Content

Aerial Lidar Semantic Segmentation Using RandLANet Deep Learning

This example shows how to train a RandLANet deep learning network to perform semantic segmentation on aerial lidar data.

You can use lidar data acquired from airborne laser scanning systems in applications such as topographic mapping, city modeling, biomass measurement, and disaster management. Extracting meaningful information from this data requires semantic segmentation, a process in which you assign each point in the point cloud a unique class label.

In this example, you train a RandLANet [1] network to perform semantic segmentation by using the Dayton Annotated Lidar Earth Scan (DALES) data set [2]. The data set contains scenes of dense, labeled aerial lidar data from urban, suburban, rural, and commercial settings. The data set provides semantic segmentation labels for eight classes: buildings, cars, trucks, poles, power lines, fences, ground, and vegetation.

Load DALES Data

The DALES data set contains 40 scenes of aerial lidar data. Out of the 40 scenes, 29 scenes are used for training and the remaining 11 scenes are used for testing. Follow the instructions on the DALES website to download the data set to the folder specified by the dataFolder variable. Create folders to store the training and test data.

dataFolder = fullfile(tempdir,"DALES");
trainDataFolder = fullfile(dataFolder,"dales_las","train");
testDataFolder = fullfile(dataFolder,"dales_las","test");

Preview a point cloud from the training data.

lasReader = lasFileReader(fullfile(trainDataFolder,"5080_54435.las"));
[pc,attr] = readPointCloud(lasReader,"Attributes","Classification");
labels = attr.Classification;

% Select only labeled data.
pc = select(pc,labels~=0);
labels = labels(labels~=0);
classNames = ["ground"; ...
    "vegetation"; ...
    "cars"; ...
    "trucks"; ...
    "powerlines"; ...
    "fences"; ...
    "poles"; ...
    "buildings"];
figure
ax = pcshow(pc.Location,labels);
helperLabelColorbar(ax,classNames)
title("Point Cloud with Overlaid Semantic Labels")

Preprocess Training Data

To preprocess the training data, first create a fileDatastore object to collect all the point cloud files in the training data.

fds = fileDatastore(trainDataFolder,"ReadFcn",@lasFileReader);

Apply a transformation that performs these taks by using the helperTransformTrainData helper function, defined at the end of this example:

  • Each point cloud in the DALES data set covers an area of 500-by-500 meters, which is much larger than the typical area covered by terrestrial rotating lidar point clouds. For efficient memory processing, divide the point cloud into small, non-overlapping blocks of size 50-by-50 meters.

  • Normalize the point cloud values to a range of [0, 1].

  • Downsample the point clouds and labels using grid subsampling to equalize the density of the point clouds.

Define the block dimensions.

blockSize = [50 50];

Apply the transformation to the training datastore.

ldsTransformed = transform(fds,@(x) helperTransformTrainData(x,blockSize));

Use the helperDatastoreWriteFcn helper function, attached to this example as a supporting file, to save the preprocessed point clouds and semantic labels as PCD and PNG files, respectively. This step substantially speeds up the training process. You can set writeFiles to false if your preprocessed training data is already present in the outputFolder.

writeFiles = true;
outputFolder = fullfile(dataFolder,"PreprocessedTrainData");
if writeFiles
    writeall(ldsTransformed,outputFolder,WriteFcn = @helperDatastoreWriteFcn,FolderLayout = "flatten");
end
Starting parallel pool (parpool) using the 'local' profile ...
Connected to parallel pool with 6 workers.
Processing done for file: 5190_54400.las

Note: Processing can take some time. The code suspends MATLAB® execution until processing is complete.

If you want to rerun the code, you must first use the "clear helperDatastoreWriteFcn" command.

Calculate the point distribution across all the classes in the training data set using the helperComputeClassWeights helper function, attached to this example as a supporting file. The calculated weights are applied to the weighted cross-entropy loss during training.

numClasses = numel(classNames);
weights = helperComputeClassWeights(fds,numClasses);

Create Datastore Object for Training

Create a fileDatastore object to store and read PCD files by using the pcread function.

pcCropTrainPath = fullfile(outputFolder,"PointCloud");
ldsTrain = fileDatastore(pcCropTrainPath,"ReadFcn",@(x) pcread(x));

Use a pixelLabelDatastore object to store pixel-wise labels from the pixel label images. The object maps each pixel label to a class name and assigns a unique label ID to each class.

% Specify label IDs from 1 to the number of classes.
labelIDs = 1 : numClasses;
labelCropTrainPath = fullfile(outputFolder,"Labels");
pxdsTrain = pixelLabelDatastore(labelCropTrainPath,classNames,labelIDs);

Use the helperPreprocessTrainData helper function, attached to this example as a supporting file preprocess training data to make it compatible with the network input layer.

preprocessedTrainingData = transform(ldsTrain,pxdsTrain,@(pt,lb) helperPreprocessTrainData(pt,lb,classNames));

Define RandLANet Network

RandLANet is a popular neural network used for semantic segmentation of unorganized lidar point clouds. This network uses an encoder-decoder architecture with skip connections. You feed the input to a shared multilayer perceptron layer, followed by four encoding and decoding layers, to learn features for each point. At end, three fully connected layers and a dropout layer associate each point in a 3-D point cloud with a class label, such as car, truck, ground, or vegetation.

This figure shows the architecture of a RandLANet, where:

  • FC — Fully Connected Layer

  • LFA — Local Feature Aggregation

  • RS — Random Sampling

  • MLP — Shared Multilayer Perceptron

  • US — Upsampling

  • DP — Dropout

2023-07-13_17-45-41.png

Each encoding layer randomly downsamples the point cloud, which significantly decreases the point density. To retain the key features of the downsampled point cloud, the network uses a local feature aggregation module on each point.

Each decoding layer upsamples the point feature set through nearest-neighbor interpolation. The network concatenates these upsampled maps with feature maps produced by the encoding layers through skip connections, and applies a shared MLP to the concatenated feature maps.

Define the RandLANet network using the helperLoadRandLANet helper function, attached to this example as a supporting file.

% RandLANet dlnetwork
net = helperLoadRandLANet(numClasses);

Specify Training Options

Use the Adam optimization algorithm to train the network. Use the trainingOptions (Deep Learning Toolbox) function to specify the hyperparameters.

learningRate = 0.01;
numEpochs = 100;
miniBatchSize = 8;
learnRateDropFactor = 0.9886;
executionEnvironment = "auto";
dataFormat = cell(1,26);
dataFormat(:) = {'SSCB'}; 

options = trainingOptions("adam", ...
    InitialLearnRate = learningRate, ...
    MaxEpochs = numEpochs, ...
    MiniBatchSize = miniBatchSize, ...
    LearnRateSchedule = "piecewise", ...
    LearnRateDropPeriod = 1,...
    LearnRateDropFactor = learnRateDropFactor, ...
    Plots = "training-progress", ...
    ExecutionEnvironment = executionEnvironment, ...
    ResetInputNormalization = false, ...
    CheckpointFrequencyUnit = "epoch", ...
    CheckpointFrequency = 10, ... 
    CheckpointPath = tempdir, ...
    InputDataFormats = dataFormat(1:25), ...
    TargetDataFormats = dataFormat(26));

Note: Reduce or increase the miniBatchSize value to control memory usage when training.

Train Model

To train the network, set the doTraining argument to true. Otherwise, load a pretrained network. To train the network, you can use a CPU or GPU. Using a GPU requires Parallel Computing Toolbox™ and a CUDA® enabled NVIDIA® GPU. For more information, see GPU Computing Requirements (Parallel Computing Toolbox). Use weighted cross-entropy loss to update the network during training using the helperLossFunction helper function, defined at the end of this example.

doTraining = false;
if doTraining
    % Train the network on the ldsTransformed datastore using 
    % the trainnet function.
    trainedNet = trainnet(preprocessedTrainingData,net,@(ypred,ytrue) helperLossFunction(ypred,ytrue,weights),options);
else
    % Load the pretrained network.
    load("randlanetSegTrained","net")
end

Segment Aerial Point Cloud

The network has been trained on downsampled point clouds. To perform segmentation on a test point cloud, first downsample the test point cloud, similar to the training data. Perform inference on the downsampled data to compute prediction labels. Interpolate these prediction labels to obtain prediction labels on the dense point cloud.

Read the full test point cloud.

lasReader = lasFileReader(fullfile(testDataFolder,"5080_54470.las"));
[ptCloud,attr] = readPointCloud(lasReader,"Attributes","Classification");
labels = attr.Classification;

Apply a transformation that performs these tasks to the test or validation data by using the helperTransformTestData helper function, attached to this example as a supporting file:

  • Crop the point clouds into non-overlapping blocks of size 50-by-50 meters.

  • Normalize the point cloud values to a range of [0, 1].

  • Downsample the point clouds using grid subsampling to equalize the density of point clouds.

  • For each point in the the dense point cloud, find the 1-nearest neighbor in the downsampled point cloud, to facilitate interpolation of the labels in postprocessing.

[transformedTestData,blockIndices] = helperTransformTestData(ptCloud,blockSize);

Perform inference on the preprocessed test data.

% Initialize prediction labels. 
labelsDensePred = zeros(size(ptCloud.Location,1),1);

for i = 1:size(transformedTestData,2)
    ptCloudSparse = transformedTestData{1,i}{1};
    kdtree = transformedTestData{1,i}{2};
    projectedIndices = transformedTestData{1,i}{3};

    % Randomly generate a value for each point, which represents the 
    % probability of a point being selected to perform inference on it.
    possibility = rand(ptCloudSparse.Count,1)*1e-3;

    predLabels = zeros(ptCloudSparse.Count,1,"int16");

    % This flag is set to 'false' when all points in ptCloudSparse 
    % are covered for inference.
    flag = true;

    while flag
        % Use the helperPreprocessTestData helper function, attached to this 
        % example as a supporting file, for preprocessing.
        [preprocessedInput,possibility] = helperPreprocessTestData(ptCloudSparse, ...
            kdtree,possibility,executionEnvironment);

        % Use the helperPredictRandLANet helper function, attached to this example as
        % a supporting file, for dlnetwork output.
        labelsSparsePred = helperPredictRandLANet(net,preprocessedInput);

        % Use the helperPostprocessRandLANet helper function, attached to this example 
        % as a supporting file, for postprocessing.
        [interpolatedLabels,flag,predLabels] = helperPostprocessRandLANet(labelsSparsePred, ...
            predLabels,possibility,preprocessedInput{1,6},projectedIndices,flag);
    end
    
    labelsDensePred(transformedTestData{1,i}{4}) = interpolatedLabels;
end

For better visualisation display only a block inferred from the point cloud data.

figure
pc = select(ptCloud,transformedTestData{1,i}{4});
ax = pcshow(pc.Location,interpolatedLabels);
axis off
helperLabelColorbar(ax,classNames)
title("Point Cloud Overlaid with Detected Semantic Labels")

Evaluate Network

To perform evaluation on the test data, get the labels from the test point cloud. Use the predicted labels you computed and are stored in labelsDensePred.

Compute the ground truth labels, and use blockIndices to reorder the ground truth labels to match the order of the predicted labels.

labelsDenseTarget = labels(blockIndices);

Use the evaluateSemanticSegmentation function to compute the semantic segmentation metrics from the test set results.

confusionMatrix = segmentationConfusionMatrix(double(labelsDensePred), ...
    double(labelsDenseTarget),Classes = 1:numClasses);
metrics = evaluateSemanticSegmentation({confusionMatrix},classNames,Verbose = false);

You can measure the amount of overlap per class using the intersection-over-union (IoU) metric.

The evaluateSemanticSegmentation function returns metrics for the entire data set, for individual classes, and for each test image. To see the metrics at the data set level, inspect the metrics.DataSetMetrics property.

metrics.DataSetMetrics
ans=1×4 table
    GlobalAccuracy    MeanAccuracy    MeanIoU    WeightedIoU
    ______________    ____________    _______    ___________

       0.94533          0.78257       0.60613      0.90656  

The data set metrics provide a high-level overview of network performance. To see the impact each class has on the overall performance, view the metrics for each class by inspecting the metrics.ClassMetrics property.

metrics.ClassMetrics
ans=8×2 table
                  Accuracy      IoU   
                  ________    ________

    ground        0.97851      0.95348
    vegetation    0.87118      0.85004
    cars          0.86573      0.50844
    trucks        0.10772     0.053908
    powerlines      0.919      0.75115
    fences        0.78975      0.35045
    poles         0.75789      0.46694
    buildings      0.9708      0.91463

Supporting Functions

The helperLossFunction function performs the weighted cross-entropy loss to handle the class imbalance on the DALES data set. This penalizes the network more if it misclassifies a point that belongs to a class with lower weight.

function loss = helperLossFunction(ypred,yactual,weights)

% Apply softmax on prediction.
ypred = softmax(ypred);

% Compute weighted cross-entropy loss.
loss = crossentropy(ypred,yactual,weights,WeightsFormat="UC",Reduction="none");
loss = mean(mean(sum(loss,3),1),4);

end

The helperTransformTrainData function applies these transformations to the training data:

  • Each point cloud in the DALES data set covers an area of 500-by-500 meters, which is much larger than the typical area covered by terrestrial rotating lidar point clouds. For efficient memory processing, divide the point cloud into small, non-overlapping blocks of size 50-by-50 meters.

  • Normalize the point cloud values to a range of [0, 1].

  • Downsample the point clouds and labels using grid subsampling to equalize the density of the point clouds.

function out = helperTransformTrainData(lasReader,blocksize)

[ptCloud,attr] = readPointCloud(lasReader,"Attributes","Classification");
labels = attr.Classification;

% Select only labeled data.
ptCloud = select(ptCloud,labels~=0);
labels = labels(labels~=0);

% Block the input point cloud.
numGridsX = round((diff(ptCloud.XLimits)+eps)/blocksize(1));
numGridsY = round((diff(ptCloud.YLimits)+eps)/blocksize(2));
[~,~,~,indx,indy] = histcounts2(ptCloud.Location(:,1),ptCloud.Location(:,2), ...
    [numGridsX, numGridsY],XBinLimits = ptCloud.XLimits, YBinLimits = ptCloud.YLimits);
ind = sub2ind([numGridsX, numGridsY],indx,indy);

out = cell(1,numGridsX*numGridsY);

    for num = 1:numGridsX*numGridsY
        idx = ind == num;
    
        % Blocked point cloud and corresponding labels.
        ptCloudDense = select(ptCloud,idx);
        labelsDense= labels(idx);
    
        % Use the helperPointCloudNormalization helper function, attached to this example as a
        % supporting file, to normalize the point cloud values to a range of [0,1].
        ptCloudNormalized = helperPointCloudNormalization(ptCloudDense);
    
        % Use the helperGridSubsampling helper function, attached to this example as a supporting file,
        % to downsample the point cloud and labels.
        gridSize = 0.04;
        [pointsSparse,labelsSparse] = helperGridSubsampling(ptCloudNormalized,labelsDense,gridSize);
    
        xyzAndLabelsSparse = [pointsSparse,labelsSparse];
        out{1,num} = xyzAndLabelsSparse;
    end

end

References

[1] Hu, Qingyong, Bo Yang, Linhai Xie, Stefano Rosa, Yulan Guo, Zhihua Wang, Niki Trigoni, and Andrew Markham. “RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds.” In 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 11105–14. Seattle, WA, USA: IEEE, 2020. https://doi.org/10.1109/CVPR42600.2020.01112.

[2] Varney, Nina, Vijayan K. Asari, and Quinn Graehling. “DALES: A Large-Scale Aerial LiDAR Data Set for Semantic Segmentation.” In 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 717–26. Seattle, WA, USA: IEEE, 2020. https://doi.org/10.1109/CVPRW50498.2020.00101.

See Also

Apps

Related Topics