Define Custom Deep Learning Layer for Code Generation
If Deep Learning Toolbox™ does not provide the layer you require for your classification or regression problem, then you can define your own custom layer using this example as a guide. For a list of built-in layers, see List of Deep Learning Layers.
To define a custom deep learning layer, you can use the template provided in this example, which takes you through these steps:
Name the layer: Give the layer a name so that you can use it in MATLAB®.
Declare the layer properties: Specify the properties of the layer, including learnable parameters and state parameters.
Create the constructor function (optional): Specify how to construct the layer and initialize its properties. If you do not specify a constructor function, then at creation, the software initializes the Name, Description, and Type properties with [] and sets the number of layer inputs and outputs to 1.
Create initialize function (optional): Specify how to initialize the learnable and state parameters when the software initializes the network. If you do not specify an initialize function, then the software does not initialize parameters when it initializes the network.
Create forward functions: Specify how data passes forward through the layer (forward propagation) at prediction time and at training time.
Create reset state function (optional): Specify how to reset state parameters.
Create a backward function (optional): Specify the derivatives of the loss with respect to the input data and the learnable parameters (backward propagation). If you do not specify a backward function, then the forward functions must support dlarray objects.
You must specify the pragma %#codegen in the layer definition to create a custom layer for code generation. Code generation does not support custom layers with state properties (properties with attribute State).
In addition, when generating code that uses third-party libraries:
Code generation supports custom layers with 2-D image or feature input only.
The inputs and output of the layer forward functions must have the same batch size.
Nonscalar properties must be a single, double, or character array.
Scalar properties must have type numeric, logical, or string.
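As a rough illustration of the two property rules above, the following sketch checks candidate property values before you assign them to a layer. The helper isSupportedProperty is hypothetical (it is not a toolbox function) and encodes only a literal reading of the two rules, so treat it as a quick sanity check rather than a definitive test of code generation support.

```matlab
% Hypothetical helper (not part of any toolbox): literal reading of the
% nonscalar and scalar property rules listed above.
isSupportedProperty = @(v) ...
    (isscalar(v)  && (isnumeric(v) || islogical(v) || isstring(v))) || ...
    (~isscalar(v) && (isa(v,"single") || isa(v,"double") || ischar(v)));

okScalar = isSupportedProperty(0.5);             % numeric scalar
okString = isSupportedProperty("same");          % string scalar
okMatrix = isSupportedProperty(single(eye(3)));  % nonscalar single array
okChar   = isSupportedProperty('zeros');         % character array
badCell  = isSupportedProperty({1,2});           % cell array is not listed
badInt   = isSupportedProperty(int8([1 2 3]));   % nonscalar integer array is not listed
```

In this sketch the first four values satisfy the rules and the last two do not.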
This example shows how to create a SReLU layer, which is a layer with four learnable parameters, and use it in a convolutional neural network. A SReLU layer performs a thresholding operation, where for each channel, the layer scales values outside an interval. The interval thresholds and scaling factors are learnable parameters [1].
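The SReLU operation is given by
$$f(x_i) = \begin{cases} t^l_i + a^l_i\,(x_i - t^l_i) & \text{if } x_i \le t^l_i \\ x_i & \text{if } t^l_i < x_i < t^r_i \\ t^r_i + a^r_i\,(x_i - t^r_i) & \text{if } t^r_i \le x_i \end{cases}$$
where $x_i$ is the input on channel $i$, $t^l_i$ and $t^r_i$ are the left and right thresholds on channel $i$, respectively, and $a^l_i$ and $a^r_i$ are the left and right scaling factors on channel $i$, respectively. These threshold values and scaling factors are learnable parameters, which the layer learns during training.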
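To make the operation concrete, here is a small worked sketch on plain numeric data. It assumes the piecewise rule implemented later in the predict function: values at or below the left threshold map to tl + al.*(x - tl), values at or above the right threshold map to tr + ar.*(x - tr), and values in between pass through unchanged. The function handle srelu and the parameter values are chosen only for illustration.

```matlab
% Illustrative SReLU on a numeric vector (scalar parameters, one channel).
srelu = @(x,tl,tr,al,ar) ...
    (x <= tl) .* (tl + al.*(x - tl)) ...
    + ((tl < x) & (x < tr)) .* x ...
    + (tr <= x) .* (tr + ar.*(x - tr));

x  = [-2 -1 0 0.5 1 3];
tl = -1;  tr = 1;     % left and right thresholds (assumed values)
al = 0.1; ar = 2;     % left and right scaling factors (assumed values)

y = srelu(x,tl,tr,al,ar);
% y is approximately [-1.1 -1 0 0.5 1 5]
```

Inputs inside the interval (-1, 1) are unchanged, the input -2 is pulled toward the left threshold with slope 0.1, and the input 3 is pushed away from the right threshold with slope 2.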
Custom Layer Template
Copy the custom layer template into a new file in MATLAB. This template gives the structure of a layer class definition. It outlines:
The optional properties blocks for the layer properties, learnable parameters, and state parameters.
The optional layer constructor function.
The optional initialize function.
The predict function and the optional forward function.
The optional resetState function for layers with state properties.
The optional backward function.
classdef myLayer < nnet.layer.Layer % ...
        % & nnet.layer.Formattable ... % (Optional)
        % & nnet.layer.Acceleratable % (Optional)

    properties
        % (Optional) Layer properties.

        % Declare layer properties here.
    end

    properties (Learnable)
        % (Optional) Layer learnable parameters.

        % Declare learnable parameters here.
    end

    properties (State)
        % (Optional) Layer state parameters.

        % Declare state parameters here.
    end

    properties (Learnable, State)
        % (Optional) Nested dlnetwork objects with both learnable
        % parameters and state parameters.

        % Declare nested networks with learnable and state parameters here.
    end

    methods
        function layer = myLayer()
            % (Optional) Create a myLayer.
            % This function must have the same name as the class.

            % Define layer constructor function here.
        end

        function layer = initialize(layer,layout)
            % (Optional) Initialize layer learnable and state parameters.
            %
            % Inputs:
            %         layer  - Layer to initialize
            %         layout - Data layout, specified as a networkDataLayout
            %                  object
            %
            % Outputs:
            %         layer - Initialized layer
            %
            %  - For layers with multiple inputs, replace layout with
            %    layout1,...,layoutN, where N is the number of inputs.

            % Define layer initialization function here.
        end

        function [Y,state] = predict(layer,X)
            % Forward input data through the layer at prediction time and
            % output the result and updated state.
            %
            % Inputs:
            %         layer - Layer to forward propagate through
            %         X     - Input data
            % Outputs:
            %         Y     - Output of layer forward function
            %         state - (Optional) Updated layer state
            %
            %  - For layers with multiple inputs, replace X with X1,...,XN,
            %    where N is the number of inputs.
            %  - For layers with multiple outputs, replace Y with
            %    Y1,...,YM, where M is the number of outputs.
            %  - For layers with multiple state parameters, replace state
            %    with state1,...,stateK, where K is the number of state
            %    parameters.

            % Define layer predict function here.
        end

        function [Y,state,memory] = forward(layer,X)
            % (Optional) Forward input data through the layer at training
            % time and output the result, the updated state, and a memory
            % value.
            %
            % Inputs:
            %         layer - Layer to forward propagate through
            %         X     - Layer input data
            % Outputs:
            %         Y      - Output of layer forward function
            %         state  - (Optional) Updated layer state
            %         memory - (Optional) Memory value for custom backward
            %                  function
            %
            %  - For layers with multiple inputs, replace X with X1,...,XN,
            %    where N is the number of inputs.
            %  - For layers with multiple outputs, replace Y with
            %    Y1,...,YM, where M is the number of outputs.
            %  - For layers with multiple state parameters, replace state
            %    with state1,...,stateK, where K is the number of state
            %    parameters.

            % Define layer forward function here.
        end

        function layer = resetState(layer)
            % (Optional) Reset layer state.

            % Define reset state function here.
        end

        function [dLdX,dLdW,dLdSin] = backward(layer,X,Y,dLdY,dLdSout,memory)
            % (Optional) Backward propagate the derivative of the loss
            % function through the layer.
            %
            % Inputs:
            %         layer   - Layer to backward propagate through
            %         X       - Layer input data
            %         Y       - Layer output data
            %         dLdY    - Derivative of loss with respect to layer
            %                   output
            %         dLdSout - (Optional) Derivative of loss with respect
            %                   to state output
            %         memory  - Memory value from forward function
            % Outputs:
            %         dLdX   - Derivative of loss with respect to layer input
            %         dLdW   - (Optional) Derivative of loss with respect to
            %                  learnable parameter
            %         dLdSin - (Optional) Derivative of loss with respect to
            %                  state input
            %
            %  - For layers with state parameters, the backward syntax must
            %    include both dLdSout and dLdSin, or neither.
            %  - For layers with multiple inputs, replace X and dLdX with
            %    X1,...,XN and dLdX1,...,dLdXN, respectively, where N is
            %    the number of inputs.
            %  - For layers with multiple outputs, replace Y and dLdY with
            %    Y1,...,YM and dLdY1,...,dLdYM, respectively, where M is the
            %    number of outputs.
            %  - For layers with multiple learnable parameters, replace
            %    dLdW with dLdW1,...,dLdWP, where P is the number of
            %    learnable parameters.
            %  - For layers with multiple state parameters, replace dLdSin
            %    and dLdSout with dLdSin1,...,dLdSinK and
            %    dLdSout1,...,dLdSoutK, respectively, where K is the number
            %    of state parameters.

            % Define layer backward function here.
        end
    end
end
Name Layer and Specify Superclasses
First, give the layer a name. In the first line of the class file, replace the existing name myLayer with codegenSReLULayer and add a comment describing the layer.
The layer functions support acceleration, so also inherit from nnet.layer.Acceleratable. For more information about accelerating custom layer functions, see Custom Layer Function Acceleration. The layer does not require formattable inputs, so remove the optional nnet.layer.Formattable superclass.
classdef codegenSReLULayer < nnet.layer.Layer ...
        & nnet.layer.Acceleratable
    % Example custom SReLU layer with codegen support.

    ...
end
Next, rename the myLayer constructor function (the first function in the methods section) so that it has the same name as the layer.
    methods
        function layer = codegenSReLULayer()
            ...
        end

        ...
    end
Save Layer
Save the layer class file in a new file named codegenSReLULayer.m. The file name must match the layer name. To use the layer, you must save the file in the current folder or in a folder on the MATLAB path.
Specify Code Generation Pragma
Add the %#codegen directive (or pragma) to your layer definition to indicate that you intend to generate code for this layer. Adding this directive instructs the MATLAB Code Analyzer to help you diagnose and fix violations that result in errors during code generation.
classdef codegenSReLULayer < nnet.layer.Layer ...
        & nnet.layer.Acceleratable
    % Example custom SReLU layer with codegen support.

    %#codegen

    ...
end
Declare Properties and Learnable Parameters
Declare the layer properties in the properties section and declare learnable parameters by listing them in the properties (Learnable) section.
By default, custom layers have these properties. Do not declare these properties in the properties section.
Property | Description |
---|---|
Name | Layer name, specified as a character vector or string scalar. For Layer array input, the trainnet and dlnetwork functions automatically assign names to layers with the name "". |
Description | One-line description of the layer, specified as a string scalar or a character vector. This description appears when the layer is displayed in a Layer array. If you do not specify a layer description, then the software displays the layer class name. |
Type | Type of the layer, specified as a character vector or a string scalar. The value of Type appears when the layer is displayed in a Layer array. If you do not specify a layer type, then the software displays the layer class name. |
NumInputs | Number of inputs of the layer, specified as a positive integer. If you do not specify this value, then the software automatically sets NumInputs to the number of names in InputNames. The default value is 1. |
InputNames | Input names of the layer, specified as a cell array of character vectors. If you do not specify this value and NumInputs is greater than 1, then the software automatically sets InputNames to {'in1',...,'inN'}, where N is equal to NumInputs. The default value is {'in'}. |
NumOutputs | Number of outputs of the layer, specified as a positive integer. If you do not specify this value, then the software automatically sets NumOutputs to the number of names in OutputNames. The default value is 1. |
OutputNames | Output names of the layer, specified as a cell array of character vectors. If you do not specify this value and NumOutputs is greater than 1, then the software automatically sets OutputNames to {'out1',...,'outM'}, where M is equal to NumOutputs. The default value is {'out'}. |
If the layer has no other properties, then you can omit the properties section.
Tip
If you are creating a layer with multiple inputs, then you must set either the NumInputs or InputNames properties in the layer constructor. If you are creating a layer with multiple outputs, then you must set either the NumOutputs or OutputNames properties in the layer constructor. For an example, see Define Custom Deep Learning Layer with Multiple Inputs.
To support code generation:
Nonscalar properties must have type single, double, or character array.
Scalar properties must be numeric or have type logical or string.
A SReLU layer does not require any additional properties, so you can remove the properties section.
A SReLU layer has four learnable parameters: the left and right thresholds and the left and right scaling factors. Declare these learnable parameters in the properties (Learnable) section and call them LeftSlope, RightSlope, LeftThreshold, and RightThreshold.
    properties (Learnable)
        % Layer learnable parameters

        LeftSlope
        RightSlope
        LeftThreshold
        RightThreshold
    end
Create Constructor Function
Create the function that constructs the layer and initializes the layer properties. Specify any variables required to create the layer as inputs to the constructor function.
The SReLU layer constructor function requires one optional argument (the layer name). Specify one input argument named args in the codegenSReLULayer function that corresponds to the optional name-value argument Name. Add a comment to the top of the function that explains the syntax of the function.
        function layer = codegenSReLULayer(args)
            % layer = codegenSReLULayer creates a SReLU layer.
            %
            % layer = codegenSReLULayer(Name=name) also specifies the
            % layer name.

            ...
        end
Initialize Layer Properties
Initialize the layer properties in the constructor function. Replace the comment % Define layer constructor function here. with code that initializes the layer properties. The learnable parameters are initialized later, in the initialize function, because their size depends on the input data.
Parse the optional argument using an arguments block, and set the default layer name to "".
            arguments
                args.Name = "";
            end
Set the Name property to the value of args.Name.
            % Set layer name.
            layer.Name = args.Name;
Give the layer a one-line description by setting the Description property of the layer. Set the description to describe the type of layer.
            % Set layer description.
            layer.Description = "SReLU";
View the completed constructor function.
        function layer = codegenSReLULayer(args)
            % layer = codegenSReLULayer creates a SReLU layer.
            %
            % layer = codegenSReLULayer(Name=name) also specifies the
            % layer name.

            arguments
                args.Name = "";
            end

            % Set layer name.
            layer.Name = args.Name;

            % Set layer description.
            layer.Description = "SReLU";
        end
With this constructor function, the command codegenSReLULayer(Name="srelu") creates a SReLU layer with the name "srelu".
Create Initialize Function
Create the function that initializes the layer learnable and state parameters when the software initializes the network. Ensure that the function only initializes learnable and state parameters when the property is empty; otherwise, the software can overwrite existing values when you load the network from a MAT file.
To initialize the learnable parameters, generate random arrays with the same number of channels as the input data.
Because the size of the input data is unknown until the network is ready to use, you must create an initialize function that initializes the learnable and state parameters using networkDataLayout objects that the software provides to the function. Network data layout objects contain information about the sizes and formats of expected input data. Create an initialize function that uses the size and format information to initialize learnable and state parameters such that they have the correct size.
The learnable parameters have the same number of dimensions as the input observations, where the channel dimension has the same size as the channel dimension of the input data, and the remaining dimensions are singleton. Create an initialize function that extracts the size and format information from the input networkDataLayout object and initializes the learnable parameters with the same number of channels.
        function layer = initialize(layer,layout)
            % layer = initialize(layer,layout) initializes the layer
            % learnable parameters using the specified input layout.

            % Find number of channels.
            idx = finddim(layout,"C");
            numChannels = layout.Size(idx);

            % Initialize empty learnable parameters.
            sz = ones(1,numel(layout.Size));
            sz(idx) = numChannels;

            if isempty(layer.LeftSlope)
                layer.LeftSlope = rand(sz);
            end

            if isempty(layer.RightSlope)
                layer.RightSlope = rand(sz);
            end

            if isempty(layer.LeftThreshold)
                layer.LeftThreshold = rand(sz);
            end

            if isempty(layer.RightThreshold)
                layer.RightThreshold = rand(sz);
            end
        end
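The size computation in initialize can be traced with plain arrays. The sketch below is only an approximation of what happens inside the layer: it replaces the networkDataLayout object with a size vector and a format character vector, and it uses find on the format as a stand-in for finddim(layout,"C"). The input size and format are the ones used later in this example.

```matlab
% Stand-ins for layout.Size and layout.Format (assumed values).
inputSize = [24 24 20 128];
fmt = 'SSCB';

% Stand-in for finddim(layout,"C"): position of the channel dimension.
idx = find(fmt == 'C');
numChannels = inputSize(idx);

% All dimensions singleton except the channel dimension.
sz = ones(1,numel(inputSize));
sz(idx) = numChannels;

% Random parameter array with one value per channel.
W = rand(sz);
paramSize = size(W);
```

Here sz is [1 1 20 1], and size(W) reports [1 1 20] because MATLAB drops trailing singleton dimensions. An array of this shape broadcasts across the spatial and batch dimensions of the input in the forward functions.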
Create Forward Functions
Create the layer forward functions to use at prediction time and training time.
Create a function named predict that propagates the data forward through the layer at prediction time and outputs the result.
The predict function syntax depends on the type of layer.
Y = predict(layer,X) forwards the input data X through the layer and outputs the result Y, where layer has a single input and a single output.
[Y,state] = predict(layer,X) also outputs the updated state parameter state, where layer has a single state parameter.
You can adjust the syntaxes for layers with multiple inputs, multiple outputs, or multiple state parameters:
For layers with multiple inputs, replace X with X1,...,XN, where N is the number of inputs. The NumInputs property must match N.
For layers with multiple outputs, replace Y with Y1,...,YM, where M is the number of outputs. The NumOutputs property must match M.
For layers with multiple state parameters, replace state with state1,...,stateK, where K is the number of state parameters.
Tip
If the number of inputs to the layer can vary, then use varargin instead of X1,…,XN. In this case, varargin is a cell array of the inputs, where varargin{i} corresponds to Xi.
If the number of outputs can vary, then use varargout instead of Y1,…,YM. In this case, varargout is a cell array of the outputs, where varargout{j} corresponds to Yj.
Because a SReLU layer has only one input and one output, the syntax for predict for a SReLU layer is Y = predict(layer,X).
For code generation support, all the layer inputs must have the same number of dimensions and batch size.
By default, the layer uses predict as the forward function at training time. To use a different forward function at training time, or retain a value required for a custom backward function, you must also create a function named forward. The software does not generate code for the forward function, but it must be code generation compatible.
The forward function propagates the data forward through the layer at training time and also outputs a memory value.
The forward function syntax depends on the type of layer:
Y = forward(layer,X) forwards the input data X through the layer and outputs the result Y, where layer has a single input and a single output.
[Y,state] = forward(layer,X) also outputs the updated state parameter state, where layer has a single state parameter.
[__,memory] = forward(layer,X) also returns a memory value for a custom backward function using any of the previous syntaxes. If the layer has both a custom forward function and a custom backward function, then the forward function must return a memory value.
You can adjust the syntaxes for layers with multiple inputs, multiple outputs, or multiple state parameters:
For layers with multiple inputs, replace X with X1,...,XN, where N is the number of inputs. The NumInputs property must match N.
For layers with multiple outputs, replace Y with Y1,...,YM, where M is the number of outputs. The NumOutputs property must match M.
For layers with multiple state parameters, replace state with state1,...,stateK, where K is the number of state parameters.
Tip
If the number of inputs to the layer can vary, then use varargin instead of X1,…,XN. In this case, varargin is a cell array of the inputs, where varargin{i} corresponds to Xi.
If the number of outputs can vary, then use varargout instead of Y1,…,YM. In this case, varargout is a cell array of the outputs, where varargout{j} corresponds to Yj.
The SReLU operation is given by
$$f(x_i) = \begin{cases} t^l_i + a^l_i\,(x_i - t^l_i) & \text{if } x_i \le t^l_i \\ x_i & \text{if } t^l_i < x_i < t^r_i \\ t^r_i + a^r_i\,(x_i - t^r_i) & \text{if } t^r_i \le x_i \end{cases}$$
where $x_i$ is the input on channel $i$, $t^l_i$ and $t^r_i$ are the left and right thresholds on channel $i$, respectively, and $a^l_i$ and $a^r_i$ are the left and right scaling factors on channel $i$, respectively. These threshold values and scaling factors are learnable parameters, which the layer learns during training.
Implement this operation in predict. In predict, the input X corresponds to $x$ in the equation. The output Y corresponds to $f(x_i)$.
Add a comment to the top of the function that explains the syntaxes of the function.
Tip
If you preallocate arrays using functions such as zeros, then you must ensure that the data types of these arrays are consistent with the layer function inputs. To create an array of zeros of the same data type as another array, use the "like" option of zeros. For example, to initialize an array of zeros of size sz with the same data type as the array X, use Y = zeros(sz,"like",X).
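A short sketch of the difference, using a single-precision input as an assumed example: without the "like" option, zeros returns a double array regardless of the input type.

```matlab
% Single-precision input, as is typical for deep learning data.
X = single([1 2; 3 4]);

% Preallocate with the same data type as X.
Y = zeros(size(X),"like",X);

% Default preallocation ignores the type of X and returns double.
Ydefault = zeros(size(X));

typeY = class(Y);
typeDefault = class(Ydefault);
```

In this sketch typeY is 'single' and typeDefault is 'double'. The "like" option also carries over other attributes of the prototype array, which is why it is the safer choice inside layer functions.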
Implementing the backward function is optional when the forward functions fully support dlarray input. For code generation support, the predict function must also support numeric input.
        function Y = predict(layer, X)
            % Y = predict(layer, X) forwards the input data X through the
            % layer and outputs the result Y.

            tl = layer.LeftThreshold;
            al = layer.LeftSlope;
            tr = layer.RightThreshold;
            ar = layer.RightSlope;

            Y = (X <= tl) .* (tl + al.*(X-tl)) ...
                + ((tl < X) & (X < tr)) .* X ...
                + (tr <= X) .* (tr + ar.*(X-tr));
        end
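To see how the expression in predict applies per-channel parameters, the following sketch evaluates the same masked sum on a small numeric array outside of any layer. The parameter values are made up for illustration, and the parameters are shaped 1-by-1-by-C so that they broadcast over the two spatial dimensions, mirroring the shape that the initialize function produces.

```matlab
% 2-by-2 spatial input with two channels (H x W x C), identical per channel.
X = cat(3,[-2 0; 0.5 3],[-2 0; 0.5 3]);

% Per-channel parameters, shaped 1x1xC for implicit expansion (assumed values).
tl = reshape([-1 0],1,1,2);    % left thresholds
tr = reshape([1 2],1,1,2);     % right thresholds
al = reshape([0.1 0.5],1,1,2); % left scaling factors
ar = reshape([2 0],1,1,2);     % right scaling factors

% Same expression as in the predict function.
Y = (X <= tl) .* (tl + al.*(X-tl)) ...
    + ((tl < X) & (X < tr)) .* X ...
    + (tr <= X) .* (tr + ar.*(X-tr));

Y1 = Y(:,:,1);   % approximately [-1.1 0; 0.5 5]
Y2 = Y(:,:,2);   % [-1 0; 0.5 2]
```

The three logical masks are mutually exclusive and cover every input value, so exactly one term contributes to each output element. Because the expression uses only elementwise operations, the same code path serves numeric input here and dlarray input during training.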
Because the predict function fully supports dlarray objects, defining the backward function is optional. For a list of functions that support dlarray objects, see List of Functions with dlarray Support.
Completed Layer
View the completed layer class file.
classdef codegenSReLULayer < nnet.layer.Layer ...
        & nnet.layer.Acceleratable
    % Example custom SReLU layer with codegen support.

    %#codegen

    properties (Learnable)
        % Layer learnable parameters

        LeftSlope
        RightSlope
        LeftThreshold
        RightThreshold
    end

    methods
        function layer = codegenSReLULayer(args)
            % layer = codegenSReLULayer creates a SReLU layer.
            %
            % layer = codegenSReLULayer(Name=name) also specifies the
            % layer name.

            arguments
                args.Name = "";
            end

            % Set layer name.
            layer.Name = args.Name;

            % Set layer description.
            layer.Description = "SReLU";
        end

        function layer = initialize(layer,layout)
            % layer = initialize(layer,layout) initializes the layer
            % learnable parameters using the specified input layout.

            % Find number of channels.
            idx = finddim(layout,"C");
            numChannels = layout.Size(idx);

            % Initialize empty learnable parameters.
            sz = ones(1,numel(layout.Size));
            sz(idx) = numChannels;

            if isempty(layer.LeftSlope)
                layer.LeftSlope = rand(sz);
            end

            if isempty(layer.RightSlope)
                layer.RightSlope = rand(sz);
            end

            if isempty(layer.LeftThreshold)
                layer.LeftThreshold = rand(sz);
            end

            if isempty(layer.RightThreshold)
                layer.RightThreshold = rand(sz);
            end
        end

        function Y = predict(layer, X)
            % Y = predict(layer, X) forwards the input data X through the
            % layer and outputs the result Y.

            tl = layer.LeftThreshold;
            al = layer.LeftSlope;
            tr = layer.RightThreshold;
            ar = layer.RightSlope;

            Y = (X <= tl) .* (tl + al.*(X-tl)) ...
                + ((tl < X) & (X < tr)) .* X ...
                + (tr <= X) .* (tr + ar.*(X-tr));
        end
    end
end
Check Custom Layer for Code Generation Compatibility
Check the code generation compatibility of the custom layer codegenSReLULayer.
The custom layer codegenSReLULayer, attached to this example as a supporting file, applies the SReLU operation to the input data. To access this layer, open this example as a live script.
Create an instance of the layer.
layer = codegenSReLULayer;
Create a networkDataLayout object that specifies the expected input size and format of typical input to the layer. Specify a valid input size of [24 24 20 128], where the dimensions correspond to the height, width, number of channels, and number of observations of the previous layer output. Specify the format as "SSCB" (spatial, spatial, channel, batch).
validInputSize = [24 24 20 128];
layout = networkDataLayout(validInputSize,"SSCB");
Check the layer validity using checkLayer. To check for code generation compatibility, set the CheckCodegenCompatibility option to true. The checkLayer function does not check that the layer uses MATLAB functions that are compatible with code generation. To check that the custom layer definition is supported for code generation, first use the Code Generation Readiness app. For more information, see Check Code by Using the Code Generation Readiness Tool (MATLAB Coder).
checkLayer(layer,layout,CheckCodegenCompatibility=true)
Skipping GPU tests. No compatible GPU device found.

Running nnet.checklayer.TestLayerWithoutBackward
.......... .......... .....
Done nnet.checklayer.TestLayerWithoutBackward
__________

Test Summary:
	 25 Passed, 0 Failed, 0 Incomplete, 9 Skipped.
	 Time elapsed: 1.1221 seconds.
The function does not detect any issues with the layer.
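Once the layer passes these checks, it can be placed in a layer array like a built-in layer. The following is a minimal sketch, not part of the original example: the input size, filter settings, and number of classes are assumptions chosen for illustration, and the code requires Deep Learning Toolbox and the codegenSReLULayer.m file on the MATLAB path.

```matlab
% Small convolutional network that uses the custom layer.
% Sizes and layer choices below are illustrative assumptions.
layers = [
    imageInputLayer([28 28 1])
    convolution2dLayer(5,20)
    codegenSReLULayer(Name="srelu")
    fullyConnectedLayer(10)
    softmaxLayer];

% Creating the dlnetwork initializes it, which calls the layer
% initialize function with the layout of the convolution output.
net = dlnetwork(layers);

% Inspect the four learnable parameters of the custom layer.
sreluLearnables = net.Learnables(net.Learnables.Layer == "srelu",:);
```

Under these assumptions, the convolution layer outputs 20 channels, so each of the four SReLU parameters should be initialized with 20 values, one per channel.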
References
[1] Hu, Xiaobin, Peifeng Niu, Jianmei Wang, and Xinxin Zhang. “A Dynamic Rectified Linear Activation Units.” IEEE Access 7 (2019): 180409–16. https://doi.org/10.1109/ACCESS.2019.2959036.
See Also
trainnet | trainingOptions | dlnetwork | functionLayer | checkLayer | setLearnRateFactor | setL2Factor | getLearnRateFactor | getL2Factor | findPlaceholderLayers | replaceLayer | PlaceholderLayer
Related Topics
- Code Generation for Deep Learning Networks
- Code Generation for Object Detection Using YOLO v3 Deep Learning Network
- Define Custom Deep Learning Layers
- Define Custom Deep Learning Layer with Learnable Parameters
- Define Custom Deep Learning Layer with Multiple Inputs
- Define Custom Deep Learning Layer with Formatted Inputs
- Define Custom Recurrent Deep Learning Layer
- Define Nested Deep Learning Layer Using Network Composition
- Check Custom Layer Validity