Code Generation for a Video Classification Network
This example shows how to generate CUDA® code for a deep learning network that classifies video and deploy the generated code onto the NVIDIA® Jetson™ Xavier board by using the MATLAB® Coder™ Support Package for NVIDIA Jetson and NVIDIA DRIVE® Platforms. The deep learning network has both convolutional and bidirectional long short-term memory (BiLSTM) layers. The generated application reads the data from a specified video file as a sequence of video frames and outputs a label that classifies the activity in the video. This example generates code for the network trained in the Classify Videos Using Deep Learning example from the Deep Learning Toolbox™. For more information, see Classify Videos Using Deep Learning (Deep Learning Toolbox).
Third-Party Prerequisites
Target Board Requirements
NVIDIA Jetson board.
Ethernet crossover cable to connect the target board and host PC (if the target board cannot be connected to a local network).
Supported JetPack SDK that includes CUDA and cuDNN libraries.
Environment variables on the target for the compilers and libraries. For information on the supported versions of the compilers and libraries and their setup, see Install and Setup Prerequisites for NVIDIA Boards.
Verify NVIDIA Support Package Installation on Host
To generate and deploy code to an NVIDIA Jetson Xavier board, you need the MATLAB Coder Support Package for NVIDIA Jetson and NVIDIA DRIVE Platforms. Use the checkHardwareSupportPackageInstall function to verify that the host system can run this example. If the function does not throw an error, the support package is correctly installed.
checkHardwareSupportPackageInstall();
Connect to the NVIDIA Hardware
The support package uses an SSH connection over TCP/IP to execute commands while building and running the generated CUDA code on the Jetson platform. You must therefore connect the target platform to the same network as the host computer or use an Ethernet crossover cable to connect the board directly to the host computer. Refer to the NVIDIA documentation on how to set up and configure your board.
To communicate with the NVIDIA hardware, you must create a live hardware connection object by using the jetson function. You must know the host name or IP address, username, and password of the target board to create a live hardware connection object. For example, when connecting to the target board for the first time, create a live object for Jetson hardware by using the command:
hwobj = jetson('jetson-name','ubuntu','ubuntu');
When you call the jetson function without arguments, it reuses the settings from the most recent successful connection to the Jetson hardware. This example establishes an SSH connection to the Jetson hardware by using the settings stored in memory.
hwobj = jetson;
Checking for CUDA availability on the Target...
Checking for 'nvcc' in the target system path...
Checking for cuDNN library availability on the Target...
Checking for TensorRT library availability on the Target...
Checking for prerequisite libraries is complete.
Gathering hardware details...
Checking for third-party library availability on the Target...
Gathering hardware details is complete.
 Board name              : NVIDIA Jetson AGX Xavier Developer Kit
 CUDA Version            : 11.4
 cuDNN Version           : 8.4
 TensorRT Version        : 8.4
 GStreamer Version       : 1.16.3
 V4L2 Version            : 1.18.0-2build1
 SDL Version             : 1.2
 OpenCV Version          : 4.5.4
 Available Webcams       : Logitech Webcam C925e
 Available GPUs          : Xavier
 Available Digital Pins  : 7  11  12  13  15  16  18  19  21  22  23  24  26  29  31  32  33  35  36  37  38  40
NOTE: If the connection fails, a diagnostic error message is reported at the MATLAB command line. The most likely cause of a failed connection is an incorrect IP address or host name.
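If the settings stored in memory are stale, recreate the live hardware connection object with the correct address and credentials. The address, username, and password below are placeholders; replace them with the values for your board.
hwobj = jetson('192.168.1.15','ubuntu','ubuntu');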
Verify GPU Environment
Use the coder.checkGpuInstall function to verify that the compilers and libraries necessary for running this example are set up correctly.
envCfg = coder.gpuEnvConfig('jetson');
envCfg.DeepLibTarget = 'cudnn';
envCfg.DeepCodegen = 1;
envCfg.Quiet = 1;
envCfg.HardwareObject = hwobj;
coder.checkGpuInstall(envCfg);
The net_classify Entry-Point Function
The net_classify entry-point function hardcodes the name of a video file; adjust this hardcoded path to the location of the video file on your target hardware. The entry-point function reads the data from the file by using a VideoReader object. The data is read into MATLAB as a sequence of images (video frames), center-cropped, and then passed as input to a trained network for prediction. Specifically, the function uses the network trained in the Classify Videos Using Deep Learning example. The function loads the network object from the net.mat file into a persistent variable and reuses the persistent object for subsequent prediction calls.
type('net_classify.m')
function out = net_classify() %#codegen

if coder.target('MATLAB')
    videoFilename = 'situp.mp4';
else
    videoFilename = '/home/ubuntu/VideoClassify/situp.mp4';
end

frameSize = [1920 1080];

% read video
video = readVideo(videoFilename, frameSize);

% specify network input size
inputSize = [224 224 3];

% crop video
croppedVideo = centerCrop(video,inputSize);

% A persistent object mynet is used to load the series network object. At
% the first call to this function, the persistent object is constructed and
% setup. When the function is called subsequent times, the same object is
% reused to call predict on inputs, thus avoiding reconstructing and
% reloading the network object.
persistent mynet;

if isempty(mynet)
    mynet = coder.loadDeepLearningNetwork('net.mat');
end

% pass in cropped input to network
out = classify(mynet, croppedVideo);

% Copyright 2019-2021 The MathWorks, Inc.
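The readVideo and centerCrop helpers called above are supporting files that ship with the example. As a rough illustration only, and not the shipped implementation, a center-crop helper along these lines crops each frame to a square about its center and resizes it to the network input size:
function out = centerCropSketch(video,inputSize)
% Illustrative sketch of a center-crop helper (an assumption, not the
% example's centerCrop.m): crop each H-by-W-by-3 frame to a centered
% square, then resize it to the network input size.
[h,w,c,numFrames] = size(video);
sideLength = min(h,w);
rowStart = floor((h - sideLength)/2) + 1;
colStart = floor((w - sideLength)/2) + 1;
out = zeros([inputSize(1:2),c,numFrames],'like',video);
for k = 1:numFrames
    frame = video(rowStart:rowStart+sideLength-1, ...
        colStart:colStart+sideLength-1,:,k);
    out(:,:,:,k) = imresize(frame,inputSize(1:2));
end
end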
About the Network
The network used to classify video input has a few notable features:
1. The network has a sequence input layer to accept image sequences as input.
2. The network uses a sequence folding layer followed by convolutional layers to apply the convolutional operations to each video frame independently, thereby extracting features from each frame.
3. The network uses a sequence unfolding layer and a flatten layer to restore the sequence structure and reshape the output to vector sequences, in anticipation of the BiLSTM layer.
4. Finally, the network uses the BiLSTM layer followed by output layers to classify the vector sequences.
To display an interactive visualization of the network architecture and information about the network layers, use the analyzeNetwork (Deep Learning Toolbox) function.
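To make the CNN-plus-BiLSTM structure concrete, the following schematic layer graph follows the same pattern. It is a minimal sketch for illustration, not the pretrained network stored in net.mat (which uses a much deeper convolutional backbone), and numClasses is a placeholder value.
numClasses = 4;  % placeholder class count for illustration
layers = [
    sequenceInputLayer([224 224 3],'Name','input')
    sequenceFoldingLayer('Name','fold')
    convolution2dLayer(3,16,'Padding','same','Name','conv')
    reluLayer('Name','relu')
    maxPooling2dLayer(2,'Stride',2,'Name','pool')
    sequenceUnfoldingLayer('Name','unfold')
    flattenLayer('Name','flatten')
    bilstmLayer(200,'OutputMode','last','Name','bilstm')
    fullyConnectedLayer(numClasses,'Name','fc')
    softmaxLayer('Name','softmax')
    classificationLayer('Name','classification')];
lgraph = layerGraph(layers);
% The folding and unfolding layers must share the mini-batch size.
lgraph = connectLayers(lgraph,'fold/miniBatchSize','unfold/miniBatchSize');
analyzeNetwork(lgraph)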
Run net_classify in MATLAB
Download the video classification network.
getVideoClassificationNetwork();
Loop over the individual frames of situp.mp4 to view the test video in MATLAB.
videoFileName = 'situp.mp4';
video = readVideo(videoFileName);
numFrames = size(video,4);

figure
for i = 1:numFrames
    frame = video(:,:,:,i);
    imshow(frame/255);
    drawnow
end
Run net_classify and note the output label. If a GPU is available on the host, it is used automatically when running net_classify.
net_classify()
ans = categorical
situp
Generate and Deploy CUDA Code on the Target
To generate a CUDA executable that you can deploy to an NVIDIA target, create a GPU code configuration object for generating an executable and set the target deep learning library to 'cudnn'.
clear cfg
cfg = coder.gpuConfig('exe');
cfg.DeepLearningConfig = coder.DeepLearningConfig('cudnn');
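The hardware report above shows that the TensorRT libraries are also present on this board. If you prefer to target TensorRT instead of cuDNN, you can set the deep learning configuration accordingly (and use 'tensorrt' for envCfg.DeepLibTarget when verifying the environment):
cfg.DeepLearningConfig = coder.DeepLearningConfig('tensorrt');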
Use the coder.hardware function to create a configuration object for the Jetson platform and assign it to the Hardware property of the GPU code configuration object cfg.
cfg.Hardware = coder.hardware('NVIDIA Jetson');
Set the build directory on the target hardware. Change the example path below to the location on your target hardware where you would like the generated code to be placed.
cfg.Hardware.BuildDir = '/home/ubuntu/VideoClassify';
The custom main file main.cu is a wrapper that calls the net_classify function in the generated library.
cfg.CustomInclude = '.';
cfg.CustomSource = fullfile('main.cu');
Run the codegen command. The code is generated and copied to the target board, where the executable is then built.
codegen -config cfg net_classify
Code generation successful: View report
Run the Generated Application on the Target
Copy the test video file situp.mp4 from the host computer to the target device by using the putFile command. Place the video file in the location hardcoded in the entry-point function net_classify; in this example, that location is the target hardware build directory.
putFile(hwobj,videoFileName,cfg.Hardware.BuildDir);
Use runApplication to launch the application on the target hardware. The label is displayed in the output terminal on the target.
status = evalc("runApplication(hwobj,'net_classify')");
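If you need to stop a deployed executable that is still running on the target, you can use the killApplication function from the same support package; the call below assumes the executable name matches the entry-point function name.
killApplication(hwobj,'net_classify');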
See Also
Functions
coder.checkGpuInstall | codegen | coder.DeepLearningConfig | coder.loadDeepLearningNetwork | jetson | runApplication | killApplication
Objects
VideoReader
Related Examples
- Classify Videos Using Deep Learning (Deep Learning Toolbox)
- Getting Started with the MATLAB Coder Support Package for NVIDIA Jetson and NVIDIA DRIVE Platforms