Object Detection in Point Clouds Using Deep Learning
3-D object detection has great significance in autonomous navigation, robotics, medicine, remote sensing, and augmented reality applications. Though point clouds provide rich 3-D information, object detection in point clouds is a challenging task due to the sparse and unstructured nature of the data.
Using deep neural networks to detect objects in a point cloud provides fast and accurate results. A 3-D object detector network takes point cloud data as an input and produces 3-D bounding boxes around each detected object.
These are a few popular methods of object detection, based on the network input:
- Input different point cloud views, such as bird's-eye view (BEV), front view, or image view, to a network and regress 3-D bounding boxes. You can also fuse features from different views for more accurate detections.
- Convert the point cloud data into a more structured representation, such as pillars or voxels, and then apply a 3-D convolutional neural network to obtain bounding boxes. PointPillars and VoxelNet are widely used networks based on this method.
  VoxelNet converts the point cloud data into equally spaced voxels and encodes the features within each voxel into a 4-D tensor. It then obtains the detection results by using a region proposal network.
  The PointPillars network uses PointNets to learn the features of the point cloud organized into vertical pillars. The network then encodes these features as pseudo images and predicts bounding boxes by using a 2-D object detection pipeline. For more information, see Get Started with PointPillars.
- Preprocess the point cloud data to derive a 2-D representation, and use a 2-D CNN to obtain 2-D bounding boxes. Then, project these 2-D boxes onto the point cloud data to obtain 3-D detection results. A minimal sketch of this rasterization follows the list.
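For example, this minimal sketch rasterizes the x-y coordinates of a point cloud into a BEV occupancy grid using only base MATLAB functions; the file name and grid extents are assumptions for illustration.

% Load a point cloud and gather its points as an M-by-3 list.
ptCloud = pcread("sample.pcd");                 % file name is an assumption
xyz = reshape(ptCloud.Location, [], 3);

% Define the ground-plane extent and resolution of the 2-D grid (meters).
xEdges = -40:0.2:40;
yEdges = -40:0.2:40;

% Count points per cell to form a BEV occupancy image for a 2-D CNN.
bev = histcounts2(xyz(:,1), xyz(:,2), xEdges, yEdges);
bev = single(bev > 0);                          % binarize the occupancy grid
imagesc(bev)                                    % inspect the 2-D representation
axis image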
Create Training Data for Object Detection
Training the network on large labeled datasets provides faster and more accurate detection results.
Use the Lidar Labeler app to interactively label point clouds and export the label data for training. You can use the app to label cuboids, lines, and voxel regions inside a point cloud. You can also add scene labels for point classification. For more information, see Get Started with the Lidar Labeler.
Augment and Preprocess Data
Data augmentation techniques add variety to limited datasets. You can transform point clouds by translating and rotating them, and you can add new bounding boxes to a point cloud. This provides distinct point clouds for training. For more details, see Data Augmentations for Lidar Object Detection Using Deep Learning.
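For example, this minimal sketch applies a rigid rotation and translation to a point cloud. It assumes ptCloud already exists in the workspace and a release that supports rigidtform3d.

% Rotate the point cloud about the z-axis and shift it along x.
theta = 10;                                       % rotation angle in degrees
R = [cosd(theta) -sind(theta) 0; ...
     sind(theta)  cosd(theta) 0; ...
     0            0           1];
tform = rigidtform3d(R, [0.5 0 0]);               % rotation plus 0.5 m translation in x
ptCloudAug = pctransform(ptCloud, tform);         % augmented copy of the point cloud

% Transform the ground-truth cuboids consistently: rotate the box centers
% with the same tform and add theta to the box yaw angles.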
To convert unorganized point clouds into an organized format, use the pcorganize function. For more information, see the Unorganized to Organized Conversion of Point Clouds Using Spherical Projection example.
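For example, this sketch converts an unorganized point cloud by using its sensor parameters; the sensor model, resolution, and variable names are assumptions.

% Describe the lidar sensor, then project the points onto its scan pattern.
params = lidarParameters("OS1Gen1-64", 1024);     % 64 channels, 1024 points per scan line
ptCloudOrganized = pcorganize(ptCloudUnorganized, params);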
When your network input is a 2-D representation, you can use the imageDatastore, pixelLabelDatastore, and boxLabelDatastore objects to divide and store the training and test data.
To store point clouds, use the fileDatastore object.
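For example, this sketch pairs point clouds with cuboid labels for detector training; the folder name and the label table are assumptions.

% Point clouds in a fileDatastore, cuboid labels in a boxLabelDatastore.
pcds = fileDatastore("pointCloudFolder", "ReadFcn", @pcread, "FileExtensions", ".pcd");
blds = boxLabelDatastore(labelTable);             % one column of 9-element cuboids per class
trainingData = combine(pcds, blds);               % pairs each point cloud with its boxes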
For aerial lidar data, use the blockedPointCloud and blockedPointCloudDatastore objects to store and process point cloud data as blocks.
For more information, see these topics.
- Preprocess Data for Domain-Specific Deep Learning Applications (Deep Learning Toolbox)
- Datastores for Deep Learning (Deep Learning Toolbox)
Create Object Detection Network
Define your network based on the network input and the layers. For a list of supported layers and how to create them, see the List of Deep Learning Layers (Deep Learning Toolbox). To visualize the network architecture, use the analyzeNetwork (Deep Learning Toolbox) function.
You can also design a network layer-by-layer interactively using the Deep Network Designer (Deep Learning Toolbox).
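For example, this sketch passes a small placeholder layer array to analyzeNetwork; it is not a lidar detection network.

% Inspect a simple layer stack in the Network Analyzer.
layers = [
    imageInputLayer([64 64 3])
    convolution2dLayer(3, 16, Padding="same")
    reluLayer
    fullyConnectedLayer(10)
    softmaxLayer];
analyzeNetwork(layers)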
Use the pointPillarsObjectDetector object to create a PointPillars object detection network.
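For example, this sketch creates a detector for a single class. The point cloud range and anchor values are illustrative only; see the pointPillarsObjectDetector reference page for the exact anchor box specification.

% Detection range [xmin xmax ymin ymax zmin zmax], in meters.
pcRange = [0 69.12 -39.68 39.68 -5 5];
classNames = "car";
% Each anchor is of the form {length, width, height, z-center, yaw angle}.
anchorBoxes = {{3.9, 1.6, 1.56, -1.78, 0}, {3.9, 1.6, 1.56, -1.78, pi/2}};
detector = pointPillarsObjectDetector(pcRange, classNames, anchorBoxes);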
Train Object Detector Network
To specify the training options, use the trainingOptions (Deep Learning Toolbox) function. You can then train the network by using the trainNetwork (Deep Learning Toolbox) function.
Use the trainPointPillarsObjectDetector function to train a PointPillars network.
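For example, this sketch trains the detector created earlier on the combined datastore; the option values are illustrative, not recommended settings.

% Configure the optimizer, then train the PointPillars detector.
options = trainingOptions("adam", ...
    MaxEpochs=60, ...
    MiniBatchSize=2, ...
    InitialLearnRate=2e-4, ...
    Plots="training-progress");
[trainedDetector, info] = trainPointPillarsObjectDetector(trainingData, detector, options);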
Detect Objects in Point Clouds Using Deep Learning Detectors and Pretrained Models
Use the detect function to detect objects by using a PointPillars network.
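For example, this sketch runs a trained detector on one point cloud and overlays the detected cuboids; trainedDetector and ptCloud are assumed to exist in the workspace.

% Run inference and visualize the resulting 3-D bounding boxes.
[bboxes, scores, labels] = detect(trainedDetector, ptCloud);
pcshow(ptCloud.Location)
showShape("cuboid", bboxes, Color="green", Opacity=0.4)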
To evaluate the detection results, use the evaluateObjectDetection and bboxOverlapRatio functions.
Lidar Toolbox™ provides pretrained object detection models for PointPillars and Complex-YOLO v4 networks.
Code Generation
To learn how to generate CUDA® code for a lidar object detection workflow, see these examples. A minimal sketch of the code generation step follows the list.
- Code Generation for Lidar Object Detection Using SqueezeSegV2 Network
- Code Generation for Lidar Object Detection Using PointPillars Deep Learning
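For example, this sketch outlines the code generation step. The entry-point function pointpillarsDetect is hypothetical; the examples above show the complete workflow.

% Configure GPU Coder for a CUDA MEX target that uses the cuDNN library.
cfg = coder.gpuConfig("mex");
cfg.TargetLang = "C++";
cfg.DeepLearningConfig = coder.DeepLearningConfig("cudnn");

% Variable-size N-by-4 input: x, y, z, and intensity for each point.
inputType = coder.typeof(single(0), [Inf 4], [1 0]);
codegen -config cfg pointpillarsDetect -args {inputType} -report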
See Also
Apps
- Lidar Labeler | Deep Network Designer (Deep Learning Toolbox)
Related Examples
- Lidar Object Detection Using Complex-YOLO v4 Network
- Code Generation for Lidar Object Detection Using SqueezeSegV2 Network
- Lidar 3-D Object Detection Using PointPillars Deep Learning
- Code Generation for Lidar Object Detection Using PointPillars Deep Learning
More About
- Deep Learning in MATLAB (Deep Learning Toolbox)
- Getting Started with Point Clouds Using Deep Learning
- Get Started with PointPillars
- Semantic Segmentation in Point Clouds Using Deep Learning
- Datastores for Deep Learning (Deep Learning Toolbox)
- List of Deep Learning Layers (Deep Learning Toolbox)