slowFastVideoClassifier

SlowFast video classifier. Requires Computer Vision Toolbox Model for SlowFast Video Classification

Since R2021b

Description

Add-On Required: This feature requires the Computer Vision Toolbox Model for SlowFast Video Classification add-on.

The slowFastVideoClassifier object is a SlowFast video classifier pretrained on the Kinetics-400 data set with a ResNet-50 3-D convolutional neural network (CNN). You can use the pretrained video classifier to classify 400 human actions such as running, walking, and shaking hands.

Creation

Syntax

sf = slowFastVideoClassifier

sf = slowFastVideoClassifier("resnet50-3d",classes)

sf = slowFastVideoClassifier(___,Name=Value)

Description

sf = slowFastVideoClassifier returns a SlowFast video classifier pretrained on the Kinetics-400 data set.

sf = slowFastVideoClassifier("resnet50-3d",classes) configures the pretrained SlowFast video classifier for transfer learning on a new set of classes, classes.

sf = slowFastVideoClassifier(___,Name=Value) sets properties using name-value arguments in addition to the input arguments from the previous syntax. For example, sf = slowFastVideoClassifier("resnet50-3d",classes,InputSize=[256,256,3,32]) sets the input size of the network. You can specify multiple name-value arguments.
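As an illustration of the transfer-learning syntaxes above, the following minimal sketch configures the classifier for five new classes. The class names are taken from the Classes property example on this page. The ModelName value "myGestureClassifier" is a hypothetical name chosen for illustration, and the code assumes the add-on is installed.

```matlab
% Class names for the new task, given as a string array.
classes = ["kiss","laugh","pick","pour","pushup"];

% Configure the pretrained classifier for transfer learning.
% InputSize is [H,W,C,T]; here T = 32 frames per sequence.
% The ModelName value is a hypothetical label for this sketch.
sf = slowFastVideoClassifier("resnet50-3d",classes, ...
    InputSize=[256,256,3,32], ...
    ModelName="myGestureClassifier");

% Inspect the resulting configuration.
disp(sf.InputSize)
disp(sf.Classes)
```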

Note

This function requires the Computer Vision Toolbox™ Model for SlowFast Video Classification. You can install Computer Vision Toolbox Model for SlowFast Video Classification from Add-On Explorer. For more information about installing add-ons, see Get and Manage Add-Ons. To use this object, you must have a license for the Deep Learning Toolbox™.

Properties

Configure Classifier Properties

`InputSize` — Size of the network
Read-only: `[256,256,3,32]` (default) | four-element row vector

This property is read-only.

Input size of the video classifier network, specified as a four-element row vector in the form [H,W,C,T], where H and W represent the height and width, respectively, C represents the number of channels, and T represents the number of frames for the video subnetwork.

Typical values for the number of frames are 8, 16, 32, or 64. Increase the number of frames to capture the temporal nature of activities when training the classifier.

`InputNormalizationStatistics` — Normalization statistics for the video data
Read-only: `structure` (default)

This property is read-only.

Normalization statistics for the video data, specified as a structure with field names Min, Max, Mean, and StandardDeviation. The Min and Max field values define the minimum and maximum values for rescaling the video data. The Mean and StandardDeviation values define the mean and standard deviation for input normalization. Each field value must be specified as a row vector with one element per channel of the video input data.

The default structure contains the fields Min, Max, Mean, and StandardDeviation with values [0,0,0], [255,255,255], [0.45,0.45,0.45], and [0.225,0.225,0.225], respectively. Calculate the statistics values from the data set on which you are training the video classifier. To rescale the data using minimum and maximum values precomputed from your data set, specify both Min and Max. Otherwise, the minimum and maximum values are calculated from each input sequence when using updateSequence or classifyVideoFile.

Note

The object normalizes the data by rescaling it between 0 and 1, and then the rescaled data is standardized by subtracting the mean and dividing by the standard deviation. The rescaled data is standardized if the Mean and StandardDeviation fields are non-empty. The input is automatically normalized when using updateSequence or classifyVideoFile object functions. The data must be manually normalized when using the forward or predict object functions.
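The two normalization steps described in the note can be sketched by hand, which is what you need before calling forward or predict. This is a minimal illustration that follows the description above using the default statistics; the random video array is a hypothetical stand-in for real data, and the object's internal implementation may differ in detail.

```matlab
% Hypothetical H-by-W-by-C-by-T clip with pixel values in [0,255].
video = single(randi([0 255],[256 256 3 32]));

% Default statistics as listed for InputNormalizationStatistics.
stats.Min = [0 0 0];
stats.Max = [255 255 255];
stats.Mean = [0.45 0.45 0.45];
stats.StandardDeviation = [0.225 0.225 0.225];

% Reshape each 1-by-C vector to 1-by-1-by-C so that it expands
% along the channel dimension of the video array.
minValue  = reshape(stats.Min,1,1,[]);
maxValue  = reshape(stats.Max,1,1,[]);
meanValue = reshape(stats.Mean,1,1,[]);
stdValue  = reshape(stats.StandardDeviation,1,1,[]);

% Step 1: rescale to the range [0,1].
rescaled = (video - minValue) ./ (maxValue - minValue);

% Step 2: standardize with the per-channel mean and standard deviation.
normalized = (rescaled - meanValue) ./ stdValue;
```

With the default statistics, a pixel value of 0 maps to (0 - 0.45)/0.225 = -2 and a pixel value of 255 maps to (1 - 0.45)/0.225, or approximately 2.44.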

`ModelName` — Name of trained video classifier
string scalar

Name of the trained video classifier, specified as a string scalar.

`Classes` — Classes that the video classifier is configured to train or classify
Read-only: vector of strings | cell array of character vectors

This property is read-only.

Classes that the video classifier is configured to train or classify, specified as a vector of strings or a cell array of character vectors. For example:

classes = ["kiss","laugh","pick","pour","pushup"];

Training Properties

`Learnables` — Learnable parameters for the SlowFast video classifier
table with three columns

Learnable parameters for the SlowFast video classifier, specified as a table with three columns.

Layer — Layer name, specified as a string scalar.
Parameter — Parameter name, specified as a string scalar.
Value — Parameter value, specified as a dlarray (Deep Learning Toolbox) object.

The network learnable parameters contain the features learned by the network. For example, the weights of convolution and fully connected layers.

`State` — State of the nonlearnable parameters of the SlowFast video classifier
table with three columns

State of the nonlearnable parameters of the SlowFast video classifier, specified as a table with three columns.

Layer — Layer name, specified as a string scalar.
Parameter — Parameter name, specified as a string scalar.
Value — Parameter value, specified as a dlarray (Deep Learning Toolbox) object.

The network state contains information remembered by the network between iterations. For example, the state of long short-term memory (LSTM) and batch normalization layers. During training or inference, you can update the network state using the output of the forward and predict object functions.
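To get a feel for the Learnables and State tables, you can inspect them directly. The sketch below assumes the add-on is installed and that the Value column is stored as a cell array of dlarray objects, as it is for a dlnetwork object; if the storage differs, the parameter count needs adjusting.

```matlab
% Load the pretrained classifier.
sf = slowFastVideoClassifier;

% Show the first few rows of each table (Layer, Parameter, Value).
disp(head(sf.Learnables))
disp(head(sf.State))

% Count the learnable parameters.
% Assumption: Value is a cell array with one dlarray per row.
numLearnables = sum(cellfun(@numel,sf.Learnables.Value));
fprintf("Total learnable parameters: %d\n",numLearnables);
```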

Streaming Video Classification Properties

`VideoSequence` — Video sequence used for streaming classification
Read-only: 4-D numeric array

This property is read-only.

Video sequence used to update and classify sequences for streaming classification, specified as a 4-D numeric array of size H-by-W-by-C-by-T, where H and W represent the height and width, respectively, C represents the number of channels, and T represents the number of frames for the video subnetwork. The updateSequence and classifySequence object functions use the video sequence stored in the VideoSequence property.
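A streaming workflow built on the VideoSequence property might look like the following minimal sketch. It assumes the add-on is installed and reuses the washingHands.avi file from the example on this page. Classifying once, after exactly InputSize(4) frames have been added, is a simplification chosen for illustration; a real application would typically classify repeatedly as frames arrive.

```matlab
% Load the pretrained classifier and open the video file.
sf = slowFastVideoClassifier;
reader = VideoReader("washingHands.avi");

% Number of frames the network expects per sequence (T in [H,W,C,T]).
sequenceLength = sf.InputSize(4);

% Add frames to the stored video sequence one at a time.
numFrames = 0;
while hasFrame(reader) && numFrames < sequenceLength
    frame = readFrame(reader);
    sf = updateSequence(sf,frame);
    numFrames = numFrames + 1;
end

% Classify the sequence accumulated in the VideoSequence property.
[label,score] = classifySequence(sf);

% Clear the stored sequence before processing another video.
sf = resetSequence(sf);
```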

Object Functions

Video Classification

`classifyVideoFile`	Classify a video file
`classifySequence`	Classify video sequence
`resetSequence`	Reset video sequence properties for streaming video classification
`updateSequence`	Update video sequence for classification

Custom Training and Inference

`forward`	Compute video classifier outputs for training
`predict`	Compute video classifier predictions

Examples

Classify Video File Using Video Classifier

This example requires the Computer Vision Toolbox™ Model for SlowFast Video Classification. You can install the Computer Vision Toolbox Model for SlowFast Video Classification from Add-On Explorer. For more information about installing add-ons, see Get and Manage Add-Ons.

Load a SlowFast video classifier pretrained on the Kinetics-400 data set.

sf = slowFastVideoClassifier;

Specify the file name of the video to classify.

videoFilename = "washingHands.avi";

For video classification, set the number of randomly selected video sequences to 15.

numSequences = 15;

Classify the video using the classifyVideoFile function.

[label,score] = classifyVideoFile(sf,videoFilename,NumSequences=numSequences)

label = categorical
     washing hands

score = single
    0.0034

Display the classified label using a vision.VideoPlayer.

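% Create a video player and a reader for the classified file
player = vision.VideoPlayer('Name','Washing Hands');
reader = VideoReader(videoFilename);
while hasFrame(reader)
    frame = readFrame(reader);
    % Resize the frame by 1.5 times for display
    frame = imresize(frame,1.5);
    % Overlay the predicted label in the top-left corner
    frame = insertText(frame,[2,2],string(label),'FontSize',18);
    step(player,frame);
end
% Release the player resources when playback finishes
release(player);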

Version History

Introduced in R2021b
