Detect objects using YOLO v2 object detector
When using this function, use of a CUDA®-enabled NVIDIA® GPU with a compute capability of 3.0 or higher is highly recommended. The GPU reduces computation time significantly. Usage of the GPU requires Parallel Computing Toolbox™.
[___] = detect(___,roi)
detects objects within the rectangular search region specified by
roi. Use output arguments from any of the previous syntaxes. Specify
input arguments from any of the previous syntaxes.
[___] = detect(___,Name,Value)
specifies options using one or more
Name,Value pair arguments in
addition to the input arguments in any of the preceding syntaxes.
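The two syntaxes above can be combined in one call. The following is a minimal sketch that reuses the pretrained detector and test image from the example on this page. The search region (the lower half of the image) and the threshold value of 0.4 are illustrative assumptions, not recommended settings.

```matlab
% Load the pretrained detector and test image used in the example on this page.
vehicleDetector = load('yolov2VehicleDetector.mat','detector');
detector = vehicleDetector.detector;
I = imread('highway.png');

% Hypothetical search region: the lower half of the image, as [x y width height].
[imgHeight,imgWidth,~] = size(I);
yTop = floor(imgHeight/2);
roi = [1 yTop imgWidth imgHeight-yTop+1];

% Restrict detection to the region and lower the score threshold.
[bboxes,scores,labels] = detect(detector,I,roi,'Threshold',0.4);
```

The region is passed positionally after the image, and the name-value pair follows it.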
Load a YOLO v2 object detector pretrained to detect vehicles.
vehicleDetector = load('yolov2VehicleDetector.mat','detector');
detector = vehicleDetector.detector;
Read a test image into the workspace.
I = imread('highway.png');
Display the input test image.
Run the pretrained YOLO v2 object detector on the test image. Inspect the results for vehicle detection. The labels are derived from the
ClassNames property of the detector.
[bboxes,scores,labels] = detect(detector,I)
bboxes = 1×4
    78    81    64    63
scores = single
    0.6224
labels = categorical
    vehicle
Annotate the image with the bounding boxes for the detections.
detectedI = I;
if ~isempty(bboxes)
    detectedI = insertObjectAnnotation(I,'rectangle',bboxes,cellstr(labels));
end
figure
imshow(detectedI)
I— Test image
Test image, specified as a real, nonsparse, grayscale, or RGB image.
Datastore, specified as a datastore object containing a collection of images. Each image must be a grayscale, RGB, or multichannel image. The function processes only the first column of the datastore, which must contain images and must be cell arrays or tables with multiple columns.
roi— Search region of interest
Search region of interest, specified as a four-element vector of form [x y width height]. The vector specifies the upper left corner and size of a region of interest in pixels.
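To make the [x y width height] convention concrete, the sketch below converts a region into the row and column ranges it covers and checks that it fits inside the image. The image size and region values are hypothetical. The sketch also assumes that the region spans width columns and height rows starting at (x, y) and that it must lie fully inside the image.

```matlab
I = zeros(240,320,3,'uint8');      % hypothetical 240-by-320 RGB image
roi = [50 60 100 80];              % [x y width height]

% Convert to the column and row ranges that the region covers.
cols = roi(1):(roi(1) + roi(3) - 1);
rows = roi(2):(roi(2) + roi(4) - 1);

% Check that the region lies fully inside the image (assumed requirement).
isInside = roi(1) >= 1 && roi(2) >= 1 && ...
    cols(end) <= size(I,2) && rows(end) <= size(I,1);

% Extract the pixels that the region covers.
if isInside
    region = I(rows,cols,:);
end
```

Note that x indexes columns and y indexes rows, so the extracted region has height rows and width columns.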
Specify optional comma-separated pairs of Name,Value arguments.
Name is the argument name and
Value is the corresponding value.
Name must appear inside quotes. You can specify several name and value
pair arguments in any order as Name1,Value1,...,NameN,ValueN.
'Threshold'— Detection threshold
0.5 (default) | scalar in the range [0, 1]
Detection threshold, specified as a comma-separated pair consisting of
'Threshold' and a scalar in the range [0, 1]. Detections that
have scores less than this threshold value are removed. To reduce false positives,
increase this value.
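The effect of the threshold can be sketched with plain arrays. The boxes and scores below are made-up values. The sketch only illustrates the rule stated above (detections scoring below the threshold are removed) and is not the function's internal code.

```matlab
% Hypothetical raw detections: one [x y width height] box per row.
bboxes = [78 81 64 63; 10 20 30 30; 150 40 50 45];
scores = [0.62; 0.35; 0.51];

threshold = 0.5;

% Remove detections whose score is less than the threshold.
keep = scores >= threshold;
bboxes = bboxes(keep,:);
scores = scores(keep);
```

With these values, the second detection (score 0.35) is removed and the other two remain.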
'SelectStrongest'— Select strongest bounding box
Select the strongest bounding box for each detected object, specified as the
comma-separated pair consisting of
'SelectStrongest' and either true or false.
true — Returns the strongest bounding box per object. The
method calls the
selectStrongestBboxMulticlass function, which uses nonmaximal
suppression to eliminate overlapping bounding boxes based on their confidence
scores.
By default, the
selectStrongestBboxMulticlass function is called as follows:
selectStrongestBboxMulticlass(bbox,scores,labels,...
    'RatioType','Union',...
    'OverlapThreshold',0.5);
false — Return all the detected bounding boxes. You can
then write your own custom method to eliminate overlapping bounding boxes.
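As one possible custom method for the false case, the sketch below applies a greedy nonmaximal suppression using the intersection-over-union ratio and a 0.5 overlap threshold. It is a simplified, single-class illustration with made-up boxes. It ignores class labels and is not the implementation of selectStrongestBboxMulticlass.

```matlab
% Hypothetical detections returned with 'SelectStrongest' set to false.
bboxes = [10 10 50 50; 12 12 50 50; 100 100 40 40];   % [x y width height]
scores = [0.9; 0.8; 0.7];
overlapThreshold = 0.5;

% Visit boxes from highest to lowest score.
[~,order] = sort(scores,'descend');
keep = false(size(scores));
areas = bboxes(:,3) .* bboxes(:,4);

for k = 1:numel(order)
    idx = order(k);
    kept = find(keep);
    % Intersection of the current box with every box kept so far.
    x1 = max(bboxes(idx,1), bboxes(kept,1));
    y1 = max(bboxes(idx,2), bboxes(kept,2));
    x2 = min(bboxes(idx,1)+bboxes(idx,3), bboxes(kept,1)+bboxes(kept,3));
    y2 = min(bboxes(idx,2)+bboxes(idx,4), bboxes(kept,2)+bboxes(kept,4));
    inter = max(0,x2-x1) .* max(0,y2-y1);
    % Overlap ratio: intersection area divided by union area.
    iou = inter ./ (areas(idx) + areas(kept) - inter);
    % Keep the box only if it does not overlap a stronger kept box too much.
    keep(idx) = all(iou <= overlapThreshold);
end

selectedBboxes = bboxes(keep,:);
selectedScores = scores(keep);
```

Here the second box overlaps the first with a ratio of about 0.85 and is suppressed, while the third box does not overlap any kept box and survives.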
'MinSize'— Minimum region size
[1 1] (default) | vector of the form [height width]
Minimum region size, specified as the comma-separated pair consisting of
'MinSize' and a vector of the form [height
width]. Units are in pixels. The minimum region size defines the
size of the smallest region containing the object.
By default, MinSize is 1-by-1.
'MaxSize'— Maximum region size
size(I) (default) | vector of the form [height width]
Maximum region size, specified as the comma-separated pair consisting of
'MaxSize' and a vector of the form [height
width]. Units are in pixels. The maximum region size defines the
size of the largest region containing the object.
By default, 'MaxSize' is set to the height and width of the input image,
I. To reduce computation time, set this value to the
known maximum region size for the objects that can be detected in the input test
image.
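The size limits use [height width] order, whereas bounding boxes use [x y width height] order, which is easy to mix up. The sketch below filters made-up boxes against hypothetical limits to show the ordering. How the function applies these limits internally is not stated here, so treat this as an illustration of the semantics only.

```matlab
% Hypothetical detections: one [x y width height] box per row.
bboxes = [78 81 64 63; 5 5 8 6; 20 30 300 200];
minSize = [10 10];      % [height width]
maxSize = [150 150];    % [height width]

% Box height is column 4 and box width is column 3.
boxHeight = bboxes(:,4);
boxWidth = bboxes(:,3);

% Keep boxes whose height and width fall within both limits.
keep = boxHeight >= minSize(1) & boxWidth >= minSize(2) & ...
       boxHeight <= maxSize(1) & boxWidth <= maxSize(2);
bboxes = bboxes(keep,:);
```

Only the first box remains: the second is smaller than the minimum size and the third is larger than the maximum size.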
'MiniBatchSize'— Mini-batch size
128 (default) | scalar
Mini-batch size, specified as the comma-separated pair consisting of
'MiniBatchSize' and a scalar value. Use
MiniBatchSize to process a large collection of images. Images
are grouped into minibatches and processed as a batch to improve computational
efficiency. Increase the minibatch size to decrease processing time. Decrease the size
to use less memory.
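The grouping described above can be sketched as simple index arithmetic. The collection size of 300 images is a hypothetical value, and the loop only illustrates how a collection splits into batches, not how the function schedules its work.

```matlab
numImages = 300;        % hypothetical collection size
miniBatchSize = 128;    % default value of 'MiniBatchSize'

% Number of batches needed to cover every image.
numBatches = ceil(numImages/miniBatchSize);

batchSizes = zeros(1,numBatches);
for b = 1:numBatches
    firstIdx = (b-1)*miniBatchSize + 1;
    lastIdx = min(b*miniBatchSize,numImages);
    % The last batch can hold fewer images than the others.
    batchSizes(b) = lastIdx - firstIdx + 1;
end
```

For these values, the collection splits into three batches of 128, 128, and 44 images. A larger batch size means fewer batches but more memory per batch.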
'ExecutionEnvironment'— Hardware resource
Hardware resource on which to run the detector, specified as the comma-separated
pair consisting of 'ExecutionEnvironment' and one of the following:
'auto' — Use a GPU if it is available. Otherwise, use the CPU.
'gpu' — Use the GPU. To use a GPU, you must have Parallel
Computing Toolbox and a CUDA-enabled NVIDIA GPU with a compute capability of 3.0 or higher. If a suitable GPU
is not available, the function returns an error.
'cpu' — Use the CPU.
'Acceleration'— Performance optimization
Performance optimization, specified as the comma-separated pair consisting of
'Acceleration' and one of the following:
'auto' — Automatically apply a number of optimizations
suitable for the input network and hardware resource.
'mex' — Compile and execute a MEX function. This option
is available only when using a GPU. Using a GPU requires Parallel
Computing Toolbox and a CUDA-enabled NVIDIA GPU with compute capability 3.0 or higher. If Parallel
Computing Toolbox or a suitable GPU is not available, then the function returns an
error.
'none' — Disable all acceleration.
The default option is 'auto'. If 'auto' is
specified, MATLAB® applies a number of compatible optimizations. If you use the
'auto' option, MATLAB never generates a MEX function.
'mex' can offer performance benefits, but at the expense of an
increased initial run time. Subsequent calls with compatible parameters are faster.
Use performance optimization when you plan to call the function multiple times using
new input data.
The 'mex' option generates and executes a MEX function based on
the network and parameters used in the function call. You can have several MEX
functions associated with a single network at one time. Clearing the network variable
also clears any MEX functions associated with that network.
The 'mex' option is available only for input data specified as
a numeric array, cell array of numeric arrays, table, or image datastore. No other
types of datastore support the 'mex' option.
The 'mex' option is available only when you are using a GPU.
You must also have a C/C++ compiler installed. For setup instructions, see MEX Setup (GPU Coder).
'mex' acceleration does not support all layers. For a list of
supported layers, see Supported Layers (GPU Coder).
bboxes— Location of objects detected within image
Location of objects detected within the input image, returned as an
M-by-4 matrix, where M is the number
of bounding boxes. Each row of
bboxes contains a
four-element vector of the form [x y width
height]. This vector specifies the upper left corner and size
of the corresponding bounding box in pixels.
scores— Detection scores
Detection confidence scores, returned as an M-by-1 vector, where M is the number of bounding boxes. A higher score indicates higher confidence in the detection.
labels— Labels for bounding boxes
Labels for bounding boxes, returned as an M-by-1
categorical array of M labels. You define the class names
used to label the objects when you train the input detector.
detectionResults— Detection results
Detection results, returned as a 3-column table with variable names, Boxes, Scores, and Labels. The Boxes column contains M-by-4 matrices, of M bounding boxes for the objects found in the image. Each row contains a bounding box as a 4-element vector in the format [x,y,width,height]. The format specifies the upper-left corner location and size in pixels of the bounding box in the corresponding image.
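To show how such a table can be read, the sketch below builds a small table with the same three variable names and extracts the boxes for one image. The values are made up, and storing each entry in a cell is an assumption about the layout based on the description above.

```matlab
% Hypothetical results for two images, one table row per image.
Boxes = {[78 81 64 63]; [10 20 30 40; 50 60 70 80]};
Scores = {single(0.6224); single([0.9; 0.7])};
Labels = {categorical({'vehicle'}); categorical({'vehicle'; 'vehicle'})};
detectionResults = table(Boxes,Scores,Labels);

% Boxes for the second image, as an M-by-4 matrix of [x y width height] rows.
boxesImage2 = detectionResults.Boxes{2};
numDetections = size(boxesImage2,1);
```

Each row of the table corresponds to one image, and the number of rows in its Boxes entry gives the number of detections for that image.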
By default, the
detect function preprocesses
the test image for object detection by:
Resizing it to the nearest possible image size used for training the YOLO v2
network. The function determines the nearest possible image size from the
TrainingImageSize property of the detector.
Normalizing its pixel values to lie in the same range as that of the images used to
train the YOLO v2 object detector. For example, if the detector was trained on
uint8 images, the test image must also have pixel values in the
range [0, 255]. Otherwise, rescale the pixel values in the test image
to that range.
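As an illustration of bringing pixel values into the [0, 255] range, the sketch below maps a hypothetical double image with values in [0, 1] onto that range. Using the MATLAB rescale function with fixed input limits is an assumption made for this sketch. Fixing the limits keeps the mapping linear and independent of the image content.

```matlab
% Hypothetical test image stored as double values in the range [0, 1].
I = rand(4,4,3);

% Map [0, 1] linearly onto [0, 255]. The fixed input limits prevent
% rescale from stretching the image to its own minimum and maximum.
I255 = rescale(I,0,255,'InputMin',0,'InputMax',1);
```

Without the 'InputMin' and 'InputMax' options, rescale maps the darkest pixel to 0 and the brightest to 255, which changes the image contrast.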