Object detection is a computer vision technique for locating instances of objects in images or videos. Object detection algorithms typically leverage machine learning or deep learning to produce meaningful results. When looking at images or video, humans can recognize and locate objects of interest in a matter of moments. The goal of object detection is to replicate this intelligence using a computer. The best approach for object detection depends on your application and the problem you are trying to solve.
Deep learning techniques require a large number of labeled training images, so using a GPU is recommended to reduce the time needed to train a model. Deep learning-based approaches to object detection use convolutional neural networks (CNNs or ConvNets), such as R-CNN, YOLO v2, and single-shot detection (SSD). You can train a custom object detector, or use a pretrained object detector through transfer learning, an approach that lets you start with a pretrained network and then fine-tune it for your application. Convolutional neural networks require Deep Learning Toolbox™. Training and prediction are supported on a CUDA®-capable GPU, which requires Parallel Computing Toolbox™. For more information, see Computer Vision Toolbox Preferences and Parallel Computing Support in MathWorks Products (Parallel Computing Toolbox).
Machine learning techniques for object detection include aggregate channel features (ACF), support vector machine (SVM) classification using histogram of oriented gradients (HOG) features, and the Viola-Jones algorithm for human face or upper-body detection. You can start with a pretrained object detector or create a custom object detector to suit your application.
Deep Learning Detectors
|Detect objects using R-CNN deep learning detector|
|Detect objects using Fast R-CNN deep learning detector|
|Detect objects using Faster R-CNN deep learning detector|
|Detect objects using SSD deep learning detector|
|Detect objects using YOLO v2 object detector|
|Create YOLO v3 object detector|
|Detect objects using Mask R-CNN instance segmentation|
|Recognize text using optical character recognition|
|Detect and estimate pose for AprilTag in image|
|Detect and decode 1-D or 2-D barcode in image|
|Detect objects using aggregate channel features|
|Detect people using aggregate channel features|
|Detect objects using the Viola-Jones algorithm|
|Foreground detection using Gaussian mixture models|
|Detect upright people using HOG features|
|Properties of connected regions|
Detect Objects Using Point Features
|Detect BRISK features|
|Detect corners using FAST algorithm|
|Detect corners using Harris–Stephens algorithm|
|Detect KAZE features|
|Detect corners using minimum eigenvalue algorithm|
|Detect MSER features|
|Detect ORB keypoints|
|Detect scale-invariant feature transform (SIFT) features|
|Detect SURF features|
|Extract interest point descriptors|
|Find matching features|
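Matching features between two images usually comes down to comparing descriptor vectors. The following is a minimal sketch in plain Python, not the toolbox implementation: it assumes descriptors are plain lists of numbers, uses Euclidean distance, and applies a nearest-neighbor ratio test. The function name `match_features` and the `max_ratio` value are hypothetical choices for illustration.

```python
import math


def match_features(desc_a, desc_b, max_ratio=0.6):
    """Return (i, j) pairs where desc_b[j] is the unambiguous nearest
    neighbor of desc_a[i].

    Assumption: descriptors are equal-length lists of floats and
    Euclidean distance is a suitable metric.
    """
    matches = []
    for i, a in enumerate(desc_a):
        # Distances from descriptor a to every candidate, closest first.
        dists = sorted((math.dist(a, b), j) for j, b in enumerate(desc_b))
        if not dists:
            continue
        best_dist, best_j = dists[0]
        # Ratio test: accept only if the best candidate is clearly
        # closer than the runner-up.
        if len(dists) == 1 or best_dist <= max_ratio * dists[1][0]:
            matches.append((i, best_j))
    return matches


if __name__ == "__main__":
    a = [[0.0, 0.0], [10.0, 10.0]]
    b = [[10.0, 10.5], [0.0, 0.1], [5.0, 5.0]]
    print(match_features(a, b))
```

A lower `max_ratio` rejects more ambiguous matches at the cost of returning fewer pairs.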
Select Detected Objects
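Detectors often return several overlapping boxes for the same object, so a selection step keeps only the strongest one. The sketch below illustrates greedy non-maximum suppression in plain Python. It is an illustration of the general idea, not the toolbox's code: the `[x, y, width, height]` box layout, the function names, and the 0.5 overlap threshold are assumptions.

```python
def overlap_ratio(a, b):
    """Intersection over union of two boxes given as [x, y, width, height]."""
    ix = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def select_strongest(boxes, scores, overlap_threshold=0.5):
    """Return indices of the boxes kept by greedy non-maximum suppression."""
    # Visit boxes from highest to lowest score.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap a stronger kept box too much.
        if all(overlap_ratio(boxes[i], boxes[k]) <= overlap_threshold for k in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [[0, 0, 10, 10], [1, 1, 10, 10], [50, 50, 10, 10]]
    scores = [0.9, 0.8, 0.7]
    print(select_strongest(boxes, scores))
```

The returned indices are ordered by descending score, so the first entry is always the strongest detection.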
Train Custom Object Detectors
Load Training Data
|Datastore for bounding box label data|
|Ground truth label data|
|Datastore for image data|
|Create training data for an object detector|
|Combine data from multiple datastores|
Train Feature-Based Object Detectors
|Train ACF object detector|
|Train cascade object detector model|
|Train an image category classifier|
Train Deep Learning Based Object Detectors
|Train an R-CNN deep learning object detector|
|Train a Fast R-CNN deep learning object detector|
|Train a Faster R-CNN deep learning object detector|
|Train an SSD deep learning object detector|
|Train YOLO v2 object detector|
Augment and Preprocess Training Data for Deep Learning
|Balance bounding box labels for object detection|
|Crop bounding boxes|
|Remove bounding boxes|
|Resize bounding boxes|
|Apply geometric transformation to bounding boxes|
|Convert rectangle to corner points list|
|Apply geometric transformation to image|
|Create randomized 2-D affine transformation|
|Create rectangular center cropping window|
|Randomly select rectangular region in image|
|Calculate 2-D integral image|
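An integral image stores cumulative sums so that the sum over any rectangular region can be read with four lookups. The following is a minimal sketch in plain Python under stated assumptions: the image is a list of rows of numbers, the result is padded with a leading row and column of zeros, and the function names are hypothetical.

```python
def integral_image(img):
    """Return a zero-padded integral image of size (rows + 1) x (cols + 1)."""
    rows = len(img)
    cols = len(img[0]) if rows else 0
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0
        for c in range(cols):
            row_sum += img[r][c]
            # Sum of everything above plus the running sum of this row.
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def box_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle, using four integral-image lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


if __name__ == "__main__":
    image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    ii = integral_image(image)
    print(box_sum(ii, 1, 1, 2, 2))
```

The zero padding removes the need for boundary checks when the rectangle touches the top or left edge of the image.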
Design Object Detection Deep Neural Networks
R-CNN (Regions With Convolutional Neural Networks)
|Box regression layer for Fast and Faster R-CNN|
|Create a Faster R-CNN object detection network|
|Softmax layer for region proposal network (RPN)|
|Classification layer for region proposal networks (RPNs)|
|Region proposal layer for Faster R-CNN|
|Non-quantized ROI pooling layer for Mask R-CNN|
|ROI input layer for Fast R-CNN|
|Neural network layer used to output fixed-size feature maps for rectangular ROIs|
|Non-quantized ROI pooling|
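ROI pooling turns a rectangular region of any size into a fixed-size feature map. The sketch below shows the quantized max-pooling variant on a single-channel feature map in plain Python. It is an illustration only: the `(top, left, height, width)` ROI layout in feature-map cells, the floor and ceiling bin boundaries, and the function names are assumptions, and the ROI is assumed to lie inside the feature map.

```python
def ceil_div(a, b):
    """Integer ceiling division for non-negative a and positive b."""
    return -(-a // b)


def roi_max_pool(feature, roi, out_h, out_w):
    """Max-pool the ROI of a 2-D feature map into an out_h x out_w grid."""
    top, left, height, width = roi
    pooled = []
    for i in range(out_h):
        # Bin i covers rows [r0, r1); floor start and ceiling end
        # guarantee that every bin contains at least one row.
        r0 = top + (i * height) // out_h
        r1 = top + ceil_div((i + 1) * height, out_h)
        row = []
        for j in range(out_w):
            c0 = left + (j * width) // out_w
            c1 = left + ceil_div((j + 1) * width, out_w)
            row.append(max(feature[r][c]
                           for r in range(r0, r1)
                           for c in range(c0, c1)))
        pooled.append(row)
    return pooled


if __name__ == "__main__":
    fmap = [[r * 4 + c for c in range(4)] for r in range(4)]
    print(roi_max_pool(fmap, (0, 0, 4, 4), 2, 2))
```

When the ROI size is not a multiple of the output size, neighboring bins in this sketch overlap by one cell. Non-quantized variants avoid such rounding by interpolating feature values instead.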
YOLO (You Only Look Once)
|Create YOLO v2 object detection network|
|Create transform layer for YOLO v2 object detection network|
|Create output layer for YOLO v2 object detection network|
|(Not recommended) Create reorganization layer for YOLO v2 object detection network|
|Space to depth layer|
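A space-to-depth operation moves each spatial block of a feature map into the channel dimension, which shrinks the spatial size while preserving every value. The following is a minimal sketch in plain Python, assuming a height x width x channels nested-list layout. The channel ordering within each block and the function name are assumptions and may differ from any particular framework.

```python
def space_to_depth(x, block):
    """Rearrange an H x W x C nested list into (H/block) x (W/block) x (C*block*block)."""
    height, width = len(x), len(x[0])
    if height % block or width % block:
        raise ValueError("height and width must be multiples of the block size")
    out = []
    for r in range(0, height, block):
        out_row = []
        for c in range(0, width, block):
            cell = []
            # Concatenate the channels of every pixel in the block,
            # scanning the block row by row.
            for dr in range(block):
                for dc in range(block):
                    cell.extend(x[r + dr][c + dc])
            out_row.append(cell)
        out.append(out_row)
    return out


if __name__ == "__main__":
    x = [[[1], [2]], [[3], [4]]]  # 2 x 2 image with one channel
    print(space_to_depth(x, 2))
```

With a block size of 2, a 4 x 4 x 1 input becomes 2 x 2 x 4: the spatial resolution halves and the channel count quadruples.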
Focal Loss Layers
|Create focal loss layer using focal loss function|
|Compute focal cross-entropy loss|
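Focal loss is a variant of cross-entropy that down-weights examples the model already classifies well, which helps when background samples far outnumber objects. The sketch below is a plain-Python illustration of the binary form, $FL(p_t) = -\alpha_t (1 - p_t)^{\gamma} \log(p_t)$. The function name and the default values `gamma=2.0` and `alpha=0.25` are assumptions for illustration, not settings taken from the toolbox.

```python
import math


def focal_loss(p, target, gamma=2.0, alpha=0.25):
    """Binary focal loss for one prediction.

    p      -- predicted probability of the positive class, in [0, 1]
    target -- 1 for a positive example, 0 for a negative example
    """
    # p_t is the probability the model assigned to the true class.
    p_t = p if target == 1 else 1.0 - p
    alpha_t = alpha if target == 1 else 1.0 - alpha
    # Clamp to avoid log(0).
    p_t = min(max(p_t, 1e-12), 1.0)
    # The modulating factor (1 - p_t)^gamma shrinks the loss of easy examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    print(focal_loss(0.9, 1))  # well-classified positive: small loss
    print(focal_loss(0.1, 1))  # misclassified positive: much larger loss
```

Setting `gamma=0` removes the modulating factor, which leaves a class-weighted cross-entropy loss.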
SSD (Single Shot Detector)
|Create SSD merge layer for object detection|
|SSD multibox object detection network|
Visualize Detection Results
Evaluate Detection Results
|Evaluate average orientation similarity metric for object detection|
|Evaluate miss rate metric for object detection|
|Evaluate precision metric for object detection|
|Compute bounding box overlap ratio|
|Compute bounding box precision and recall against ground truth|
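Precision-based evaluation ranks detections by confidence and tracks how precision and recall change as more detections are accepted. The following is a minimal sketch in plain Python. It assumes each detection has already been marked as a true or false positive, for example by comparing its overlap ratio with ground truth against a threshold. The function name and the non-interpolated summation are assumptions, so the value may differ from what a specific toolbox function reports.

```python
def average_precision(scores, is_true_positive, num_ground_truth):
    """Return (average precision, precision list, recall list)."""
    if num_ground_truth <= 0:
        raise ValueError("num_ground_truth must be positive")
    # Process detections from most to least confident.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = fp = 0
    ap = 0.0
    prev_recall = 0.0
    precisions, recalls = [], []
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        precision = tp / (tp + fp)
        recall = tp / num_ground_truth
        # Accumulate the area under the precision-recall curve as a step sum.
        ap += precision * (recall - prev_recall)
        prev_recall = recall
        precisions.append(precision)
        recalls.append(recall)
    return ap, precisions, recalls


if __name__ == "__main__":
    ap, precision, recall = average_precision(
        [0.9, 0.8, 0.7], [True, False, True], num_ground_truth=2)
    print(ap, precision, recall)
```

The miss rate at any point along the curve is one minus the recall at that point.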
|Deep Learning Object Detector|Detect objects using trained deep learning object detector|
Object detection using deep learning neural networks.
Choose functions that return and accept points objects for several types of features.
Specify pixel indices, spatial coordinates, and 3-D coordinate systems.
Learn the benefits and applications of local feature detection and extraction.
Use the Computer Vision Toolbox™ functions for image category classification by creating a bag of visual words.
Train a custom classifier.
Compare visualization functions.
Training Data for Object Detection and Semantic Segmentation
Interactively label rectangular ROIs for object detection, pixels for semantic segmentation, polygons for instance segmentation, and scenes for image classification.
Interactively label rectangular ROIs for object detection, pixels for semantic segmentation, polygons for instance segmentation, and scenes for image classification in a video or image sequence.
Datastores for Deep Learning (Deep Learning Toolbox)
Learn how to use datastores in deep learning applications.
Perform multiclass instance segmentation using Mask R-CNN and deep learning.
Create training data for object detection or semantic segmentation using the Image Labeler or Video Labeler.
Get Started With Deep Learning
Deep Network Designer (Deep Learning Toolbox)
List of Deep Learning Layers (Deep Learning Toolbox)
Discover all the deep learning layers in MATLAB®.
Deep Learning in MATLAB (Deep Learning Toolbox)
Discover deep learning capabilities in MATLAB using convolutional neural networks for classification and regression, including pretrained networks and transfer learning, and training on GPUs, CPUs, clusters, and clouds.
Pretrained Deep Neural Networks (Deep Learning Toolbox)
Learn how to download and use pretrained convolutional neural networks for classification, transfer learning and feature extraction.