fasterRCNNLayers

Create a Faster R-CNN object detection network

Since R2019b

Description

lgraph = fasterRCNNLayers(inputImageSize,numClasses,anchorBoxes,network) returns a Faster R-CNN network as a layerGraph (Deep Learning Toolbox) object. A Faster R-CNN network is a convolutional neural network based object detector. The detector predicts the coordinates of bounding boxes, objectness scores, and classification scores for a set of anchor boxes. To train the created network, use the trainFasterRCNNObjectDetector function. For more information, see Getting Started with R-CNN, Fast R-CNN, and Faster R-CNN.

lgraph = fasterRCNNLayers(inputImageSize,numClasses,anchorBoxes,network,featureLayer) returns the object detection network based on the specified featureLayer of the network. Use this syntax when you specify the network as a SeriesNetwork (Deep Learning Toolbox), DAGNetwork (Deep Learning Toolbox), or layerGraph (Deep Learning Toolbox) object.

lgraph = fasterRCNNLayers(___,Name=Value) returns the object detection network with optional input properties specified by one or more name-value arguments.

Using this function requires Deep Learning Toolbox™.

Examples

Specify the image size.

inputImageSize = [224 224 3];

Specify the number of object classes to detect.

numClasses = 1;

Use a pretrained ResNet-50 network as the base network for the Faster R-CNN network. You must download the resnet50 (Deep Learning Toolbox) support package.

network = "resnet50";

Specify the network layer to use for feature extraction. You can use the analyzeNetwork (Deep Learning Toolbox) function to see all the layer names in a network.

featureLayer = "activation_40_relu";

Specify the anchor boxes. You can also use the estimateAnchorBoxes function to estimate anchor boxes from your training data.

anchorBoxes = [64,64; 128,128; 192,192];

Create the Faster R-CNN object detection network.

lgraph = fasterRCNNLayers(inputImageSize,numClasses,anchorBoxes, ...
                          network,featureLayer)
lgraph = 
  LayerGraph with properties:

     InputNames: {'input_1'}
    OutputNames: {'rcnnClassification'  'boxDeltas'  'rpnBoxDeltas'  'rpnClassification'}
         Layers: [188x1 nnet.cnn.layer.Layer]
    Connections: [205x2 table]

Visualize the network using the network analyzer.

analyzeNetwork(lgraph)

Input Arguments

Network input image size, specified as a 3-element vector in the format [height, width, depth]. depth is the number of image channels. Set depth to 3 for RGB images, to 1 for grayscale images, or to the number of channels for multispectral and hyperspectral images.
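As a minimal sketch of how the [height, width, depth] vector might be derived from an image, the snippet below queries each dimension explicitly. The placeholder images and the variable names rgbImage, grayImage, rgbInputSize, and grayInputSize are assumptions made for illustration; they are not part of the function's interface.

```matlab
% Placeholder images standing in for real training images (assumptions).
rgbImage  = zeros(224,224,3,"uint8");
grayImage = zeros(224,224,"uint8");

% size(I) returns only two elements for a grayscale image, so query each
% dimension explicitly to always obtain [height, width, depth].
rgbInputSize  = [size(rgbImage,1)  size(rgbImage,2)  size(rgbImage,3)];
grayInputSize = [size(grayImage,1) size(grayImage,2) size(grayImage,3)];
```

Querying the third dimension explicitly yields a depth of 1 for grayscale data, so the result always has three elements.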

Number of classes for the network to classify, specified as an integer greater than 0.

Anchor boxes, specified as an M-by-2 matrix of M anchor boxes in the format [height, width]. Anchor boxes are determined based on the scale and aspect ratio of objects in the training data set. For example, if an object is localized by a square window, then you can set the size of the anchor boxes to [64 64;128 128].
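One way to build such an M-by-2 matrix by hand is to combine a few scales with a few aspect ratios. The sketch below is only illustrative: the base size, the scales, the aspect ratios, and the variable names are hypothetical choices, and boxes estimated from your own training data are likely to be a better fit.

```matlab
% Hypothetical base size, scales, and aspect ratios. These values are
% illustrative assumptions, not recommendations.
baseSize     = 64;          % side length of the smallest square box, in pixels
scales       = [1 2 3];     % multiples of the base size
aspectRatios = [0.5 1 2];   % height divided by width

% Form every scale and aspect-ratio combination while keeping the box area
% close to (baseSize*scale)^2.
[s,r]   = meshgrid(scales,aspectRatios);
side    = baseSize .* s(:);
heights = round(side .* sqrt(r(:)));
widths  = round(side ./ sqrt(r(:)));

% M-by-2 matrix in [height, width] format, with duplicate rows removed.
anchorBoxes = unique([heights widths],"rows");
```

With an aspect ratio of 1 in the list, the result includes the square boxes [64 64], [128 128], and [192 192] alongside the taller and wider variants.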

Pretrained classification network, specified as a SeriesNetwork (Deep Learning Toolbox) object, a DAGNetwork (Deep Learning Toolbox) object, a layerGraph (Deep Learning Toolbox) object, or as the name of a pretrained network (for example, "resnet50").

When you specify the network as a SeriesNetwork (Deep Learning Toolbox) object, a DAGNetwork (Deep Learning Toolbox) object, or by name, the function transforms the network into a Faster R-CNN network. It transforms the network by adding a region proposal network (RPN), an ROI max pooling layer, and new classification and regression layers to support object detection.

Feature extraction layer, specified as a character vector or a string scalar. Use one of the deeper layers in the network you specify. You can use the analyzeNetwork (Deep Learning Toolbox) function to view the names of the layers in the input network.

Note

You can specify any network layer except the fully connected layer as the feature layer.

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Example: ROIMaxPoolingLayer="auto" directs the function to insert a new ROI max pooling layer after the feature extraction layer when the layer next to the feature extraction layer is not a max pooling layer.

ROI max pooling layer, specified as "auto", "insert", or "replace". You can specify whether a roiMaxPooling2dLayer replaces the pooling layer or follows the feature extraction layer.

If you select "auto", the function:

  • Inserts a new ROI max pooling layer after the feature extraction layer when the layer next to the feature extraction layer is not a max pooling layer.

  • Replaces the current pooling layer after the feature extraction layer with an ROI max pooling layer.

ROI max pooling layer output size, specified as "auto" or a 2-element vector of positive integers. When you set the value to "auto", the function determines the output size based on the ROIMaxPoolingLayer property. It uses the output size of the feature extraction layer or the pooling layer following the feature extraction layer.

Note

If the input image dimensions specified by the inputImageSize argument are not in a 1-to-1 aspect ratio, you must set the value of ROIOutputSize to dimensions that are compatible with the feature layer output size.
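The following sketch shows how the two name-value arguments described in this section might be passed together. It reuses the values from the example on this page and assumes that Deep Learning Toolbox, Computer Vision Toolbox, and the ResNet-50 support package are installed. The choice of "insert" is only for illustration.

```matlab
% Values reused from the example earlier on this page.
inputImageSize = [224 224 3];
numClasses     = 1;
anchorBoxes    = [64,64; 128,128; 192,192];
network        = "resnet50";             % requires the ResNet-50 support package
featureLayer   = "activation_40_relu";

% Place the ROI max pooling layer after the feature extraction layer and
% let the function determine the ROI output size.
lgraph = fasterRCNNLayers(inputImageSize,numClasses,anchorBoxes, ...
                          network,featureLayer, ...
                          ROIMaxPoolingLayer="insert", ...
                          ROIOutputSize="auto");
```

For a non-square input image, the note above applies: replace "auto" with a 2-element ROIOutputSize, and consider inspecting the feature layer output size with analyzeNetwork before choosing it.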

Output Arguments

Object detection network, returned as a layerGraph (Deep Learning Toolbox) object. The output and base network imageInputLayer normalization values are equal.

Version History

Introduced in R2019b