Detect and isolate speech and other sounds

Detect speech and other sounds and locate their start and end times. For streaming applications, use a voice activity detector (VAD) to output the probability that speech is present in a given frame. You can also use speech2text to create time-aligned word labels for speech signals.
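To illustrate what "locating start and end times" means in practice, here is a minimal sketch of a frame-energy detector in plain Python. It is not the algorithm used by `detectSpeech`, `voiceActivityDetector`, or any other function listed on this page. The helper names (`frame_energies`, `detect_regions`), the 20 ms frame length, and the fixed energy threshold are all assumptions chosen for the illustration.

```python
import math

def frame_energies(signal, frame_len):
    """Mean squared amplitude of each non-overlapping frame (hypothetical helper)."""
    n_frames = len(signal) // frame_len
    return [
        sum(x * x for x in signal[i * frame_len:(i + 1) * frame_len]) / frame_len
        for i in range(n_frames)
    ]

def detect_regions(signal, fs, frame_len=160, threshold=0.01):
    """Return (start_s, end_s) pairs for runs of frames whose energy exceeds threshold.

    frame_len and threshold are illustrative assumptions, not toolbox defaults.
    """
    active = [e > threshold for e in frame_energies(signal, frame_len)]
    regions = []
    start = None
    for i, is_active in enumerate(active):
        if is_active and start is None:
            start = i  # first active frame of a new region
        elif not is_active and start is not None:
            regions.append((start * frame_len / fs, i * frame_len / fs))
            start = None
    if start is not None:
        # signal ended while a region was still active
        regions.append((start * frame_len / fs, len(active) * frame_len / fs))
    return regions

# Synthetic test signal at 8 kHz: 0.5 s silence, 1.0 s of a 220 Hz tone, 0.5 s silence.
fs = 8000
tone = [0.5 * math.sin(2 * math.pi * 220 * n / fs) for n in range(fs)]
signal = [0.0] * (fs // 2) + tone + [0.0] * (fs // 2)

regions = detect_regions(signal, fs)
print(regions)  # [(0.5, 1.5)]
```

A fixed threshold only works on clean, level-normalized audio. Real detectors typically adapt the threshold to the noise floor or use a trained model, which is why the dedicated functions listed below are preferable for actual speech.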


Signal Labeler: Label signal attributes, regions, and points of interest, and extract features


voiceActivityDetector: Detect presence of speech in audio signal


enhanceSpeech: Enhance speech signal (Since R2024a)
separateSpeakers: Separate signal by speakers (Since R2023b)
detectspeechnn: Detect boundaries of speech in audio signal using AI (Since R2023a)
detectSpeech: Detect boundaries of speech in audio signal (Since R2020a)
classifySound: Classify sounds in audio signal (Since R2020b)
identifyLanguage: Identify languages in speech signals (Since R2024b)


Voice Activity Detector: Detect presence of speech in audio signal


