
Pretrained Models

Transfer learning, sound classification, feature embeddings, pretrained audio deep learning networks

Audio Toolbox™ provides MATLAB® and Simulink® support for pretrained audio deep learning networks. Locate and classify sounds with YAMNet and estimate pitch with CREPE. Extract VGGish or OpenL3 feature embeddings to input to machine learning and deep learning systems. Use i-vector systems to produce compact representations of audio signals for applications such as speaker recognition, verification, identification, and diarization. Use detectspeechnn to perform voice activity detection (VAD).

Using pretrained deep learning networks requires Deep Learning Toolbox™. The Audio Toolbox pretrained networks are available in Deep Network Designer (Deep Learning Toolbox).
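
As a rough illustration of how these pieces fit together, the sketch below calls four of the functions listed on this page on one signal. It assumes that Audio Toolbox and Deep Learning Toolbox are installed and that the pretrained model files are already available locally (some of them may need to be downloaded separately before first use). The test signal is a synthetic tone chosen only for illustration, so the classification and speech-detection results are not meaningful in themselves.

```matlab
% Minimal sketch, not a definitive workflow.
% Assumptions: Audio Toolbox + Deep Learning Toolbox are installed and the
% pretrained models (YAMNet, CREPE, VGGish, VAD network) are available.

fs = 16000;                       % sample rate in Hz (illustrative choice)
t = (0:1/fs:3-1/fs)';             % 3 seconds of time samples, column vector
audioIn = 0.5*sin(2*pi*220*t);    % synthetic 220 Hz tone as a stand-in signal

% Sound classification with YAMNet: labels and their time regions
[sounds, timestamps] = classifySound(audioIn, fs);

% Pitch estimation with CREPE: one estimate (in Hz) per analysis frame
f0 = pitchnn(audioIn, fs);

% VGGish feature embeddings: one 128-element row per analysis frame
embeddings = vggishEmbeddings(audioIn, fs);

% Voice activity detection: start and end sample indices of speech regions
roi = detectspeechnn(audioIn, fs);
```

In practice, `audioIn` and `fs` would typically come from `audioread` on a recording. The embeddings can then be passed to a downstream classifier, which is the transfer learning use case this page describes.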


Functions

vggishEmbeddings - Extract VGGish feature embeddings
vggish - VGGish neural network
vggishPreprocess - Preprocess audio for VGGish feature extraction
classifySound - Classify sounds in audio signal
yamnet - YAMNet neural network
yamnetGraph - Graph of YAMNet AudioSet ontology
yamnetPreprocess - Preprocess audio for YAMNet classification
openl3Embeddings - Extract OpenL3 feature embeddings
openl3 - OpenL3 neural network
openl3Preprocess - Preprocess audio for OpenL3 feature extraction
pitchnn - Estimate pitch with deep learning neural network
crepe - CREPE neural network
crepePreprocess - Preprocess audio for CREPE deep learning network
crepePostprocess - Postprocess output of CREPE deep learning network
speakerRecognition - Pretrained speaker recognition system
ivectorSystem - Create i-vector system
detectspeechnn - Detect boundaries of speech in audio signal using AI
vadnet - Voice activity detection (VAD) neural network
vadnetPreprocess - Preprocess audio for voice activity detection (VAD) network
vadnetPostprocess - Postprocess frame-based VAD probabilities


Blocks

VGGish Embeddings - Extract VGGish embeddings
VGGish Preprocess - Preprocess audio for VGGish feature extraction
VGGish - VGGish embeddings extraction network
Sound Classifier - Classify sounds in audio signal
YAMNet - YAMNet sound classification network
YAMNet Preprocess - Preprocess audio for YAMNet classification
OpenL3 Embeddings - Extract OpenL3 embeddings
OpenL3 Preprocess - Preprocess audio for OpenL3 embeddings extraction
OpenL3 - OpenL3 embeddings extraction network
Deep Pitch Estimator - Estimate pitch with CREPE deep learning neural network
CREPE - CREPE deep pitch estimation neural network
CREPE Preprocess - Preprocess audio for CREPE deep pitch estimation
CREPE Postprocess - Postprocess output of CREPE pitch estimation network


Apps

Deep Network Designer - Design, visualize, and train deep learning networks