Object detection using deep learning provides a fast and accurate means to predict the location of an object in an image. Deep learning is a powerful machine learning technique in which the object detector automatically learns the image features required for detection tasks. Several techniques for object detection using deep learning are available, such as Faster R-CNN, you only look once (YOLO) v2, YOLO v3, and single shot detection (SSD).
Applications for object detection include:
Use a labeling app to interactively label ground truth data in a video, image sequence, image collection, or custom data source. You can label object detection ground truth using rectangle labels, which define the position and size of the object in the image.
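As a minimal sketch of what a rectangle label encodes, the snippet below builds a small ground truth table by hand. The file names, the class name vehicle, and all coordinates are hypothetical. Each box is assumed to follow the [x y width height] convention in pixels, with (x, y) at the upper-left corner of the rectangle.

```matlab
% Hypothetical file names and boxes, for illustration only.
imageFilename = {'image1.jpg'; 'image2.jpg'};
vehicle = {[100 50 40 30]; [20 60 80 45; 200 120 35 25]};
gTruth = table(imageFilename, vehicle);

% Position is the first two elements of a box, size is the last two.
firstBox = gTruth.vehicle{1};
position = firstBox(1:2);      % [x y] of the upper-left corner
boxSize  = firstBox(3:4);      % [width height]
boxArea  = prod(boxSize);      % area in pixels

% Count the labeled objects in each image (one row per box).
numObjects = cellfun(@(b) size(b, 1), gTruth.vehicle);
```

An image with several objects stores one row per object, which is why the second entry is a 2-by-4 matrix.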
Data augmentation provides a way to train with limited data sets. Minor changes, such as translating, cropping, or otherwise transforming an image, provide new, distinct, and unique images that you can use to train a robust detector. Datastores are a convenient way to read and augment collections of data. Use imageDatastore, together with a datastore for the box labels, to create datastores for images and labeled bounding box data.
Augment Bounding Boxes for Object Detection (Deep Learning Toolbox)
Preprocess Images for Deep Learning (Deep Learning Toolbox)
Preprocess Data for Domain-Specific Deep Learning Applications (Deep Learning Toolbox)
For more information about augmenting training data using datastores, see Datastores for Deep Learning (Deep Learning Toolbox) and Perform Additional Image Processing Operations Using Built-In Datastores (Deep Learning Toolbox).
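The datastore workflow described above can be sketched as follows. This is one possible arrangement, not the only one. It assumes the sample file vehicleTrainingData.mat that ships with Computer Vision Toolbox is available, and it uses boxLabelDatastore, combine, and transform as the pairing and augmentation mechanism. The helper randomFlip and the overlap threshold are illustrative choices.

```matlab
% Assumption: the sample data set vehicleTrainingData.mat that ships with
% Computer Vision Toolbox is on the path.
data = load('vehicleTrainingData.mat');
trainingData = data.vehicleTrainingData;
dataDir = fullfile(toolboxdir('vision'), 'visiondata');
trainingData.imageFilename = fullfile(dataDir, trainingData.imageFilename);

% One datastore for images, one for boxes and labels, combined so that
% each read returns {image, boxes, labels}.
imds = imageDatastore(trainingData.imageFilename);
blds = boxLabelDatastore(trainingData(:, 2:end));
ds = combine(imds, blds);

% Apply the augmentation lazily, each time an item is read.
augmentedDs = transform(ds, @randomFlip);
sample = read(augmentedDs);

function out = randomFlip(in)
    % Randomly reflect the image left-right and warp the boxes to match.
    I = in{1};
    tform = randomAffine2d('XReflection', true);
    rout = affineOutputView(size(I), tform);
    out = in;
    out{1} = imwarp(I, tform, 'OutputView', rout);
    % Drop boxes that overlap the output view by less than the threshold.
    [out{2}, kept] = bboxwarp(in{2}, tform, rout, 'OverlapThreshold', 0.25);
    out{3} = in{3}(kept);
end
```

Warping the boxes with the same transform as the image keeps the labels aligned with the augmented pixels, which is the main reason to augment through a datastore rather than editing the image files directly.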
Each object detector contains a unique network architecture. For example, the Faster R-CNN detector uses a two-stage network for detection, whereas the YOLO v2 detector uses a single stage. Use the network-creation function for your chosen detector to create a network. You can also design a network layer by layer using the Deep Network Designer (Deep Learning Toolbox).
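As a hedged illustration of creating a single-stage network, the sketch below uses yolov2Layers with a ResNet-50 backbone. The choice of yolov2Layers, the input size, the anchor boxes, and the feature layer name are assumptions for illustration. The sketch also assumes that the ResNet-50 support package for Deep Learning Toolbox is installed. The recommended function may differ between releases.

```matlab
% Assumptions: the ResNet-50 support package is installed; all values below
% are illustrative rather than tuned for a particular data set.
imageSize = [224 224 3];                 % network input size
numClasses = 1;                          % for example, a single "vehicle" class
anchorBoxes = [1 1; 4 6; 5 3; 9 6];      % illustrative anchor box sizes

% Pretrained backbone and the layer whose output feeds the detection head.
network = resnet50();
featureLayer = 'activation_49_relu';

% Assemble the YOLO v2 detection network as a layer graph.
lgraph = yolov2Layers(imageSize, numClasses, anchorBoxes, network, featureLayer);
```

In practice, anchor boxes are usually estimated from the training labels instead of being fixed by hand.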
Use a training function, such as trainSSDObjectDetector, to train an object detector. Use an evaluation function, such as evaluateDetectionPrecision, to evaluate the training results.
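The train-then-evaluate flow can be sketched as below. Note the substitution: this sketch trains a YOLO v2 detector with trainYOLOv2ObjectDetector rather than an SSD detector. It assumes that vehicleTrainingData.mat and yolov2VehicleDetector.mat are the sample files shipped with Computer Vision Toolbox, and that the latter contains a layer graph named lgraph. The 70/30 split and the training options are illustrative, not tuned.

```matlab
% Assumptions: both MAT-files are toolbox sample files; option values are
% illustrative only.
data = load('vehicleTrainingData.mat');
trainingData = data.vehicleTrainingData;
dataDir = fullfile(toolboxdir('vision'), 'visiondata');
trainingData.imageFilename = fullfile(dataDir, trainingData.imageFilename);

% Hold out the last 30% of the rows for evaluation.
numTrain = floor(0.7 * height(trainingData));
trainTbl = trainingData(1:numTrain, :);
testTbl = trainingData(numTrain+1:end, :);

% Load a YOLO v2 layer graph and train on the training rows.
net = load('yolov2VehicleDetector.mat');
options = trainingOptions('sgdm', ...
    'InitialLearnRate', 0.001, ...
    'MiniBatchSize', 16, ...
    'MaxEpochs', 30, ...
    'Verbose', true);
detector = trainYOLOv2ObjectDetector(trainTbl, net.lgraph, options);

% Run the detector on the held-out images and compute average precision.
testDs = combine(imageDatastore(testTbl.imageFilename), ...
                 boxLabelDatastore(testTbl(:, 2:end)));
results = detect(detector, testDs);
[ap, recall, precision] = evaluateDetectionPrecision(results, testDs);
```

Evaluating on images that were not used for training gives a more honest estimate of detector quality than reusing the training set.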
Detect objects in an image using the trained detector. For example, the partial code shown below uses the trained detector on an image I. Use the detect object function on the detector object to return bounding boxes, detection scores, and categorical labels assigned to the bounding boxes.
I = imread(input_image);
[bboxes,scores,labels] = detect(detector,I);
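A slightly fuller, hedged version of this step is sketched below. It assumes that the pretrained sample detector in yolov2VehicleDetector.mat and the sample image highway.png are available with the toolbox. The 0.5 score cutoff is an arbitrary illustrative value.

```matlab
% Assumptions: the sample detector and sample image ship with the toolbox.
vehicleDetector = load('yolov2VehicleDetector.mat', 'detector');
detector = vehicleDetector.detector;
I = imread('highway.png');

% Each row of bboxes is [x y width height]; scores and labels have one
% entry per row.
[bboxes, scores, labels] = detect(detector, I);

% Keep only the more confident detections (illustrative cutoff).
keep = scores >= 0.5;

% Draw the surviving boxes with their class labels, if there are any.
annotated = I;
if any(keep)
    annotated = insertObjectAnnotation(I, 'rectangle', ...
        bboxes(keep, :), cellstr(labels(keep)));
end
figure
imshow(annotated)
```

Guarding the annotation call keeps the script from failing on images where nothing is detected.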
The MathWorks® GitHub repository provides implementations of the latest pretrained object detection deep learning networks, which you can download and use for out-of-the-box inference. These networks are already trained on standard data sets such as COCO and Pascal VOC. You can use the pretrained models directly to detect different objects in a test image.
To perform object detection by using the pretrained you-only-look-once (YOLO) v2 and v4 deep learning networks, see Object Detection Using Pretrained YOLO v2 Deep Learning Network and Object Detection Using Pretrained YOLO v4 Deep Learning Network, respectively.
To perform scene text detection by using a pretrained deep learning network, see Pretrained Character Region Awareness For Text Detection Model. You can use this pretrained model to detect text in images. The model can detect text in these seven languages: English, Korean, Italian, French, Arabic, German, and Bangla.
For a list of all the latest MathWorks pretrained models and examples, see MATLAB Deep Learning (GitHub).