Feature Extraction

Mel spectrogram, MFCC, pitch, spectral descriptors

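Extract features from audio signals for use as input to machine learning or deep learning systems. Use individual functions, such as melSpectrogram, mfcc, pitch, and spectralCentroid, or use the audioFeatureExtractor object to create a feature extraction pipeline that minimizes redundant computations. To extract features from audio signals in Simulink®, use blocks such as Mel Spectrogram and MFCC. In a live script, use the Extract Audio Features task to graphically select the features to extract.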
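As a rough illustration of the two approaches, the following sketch first calls the individual functions and then builds an audioFeatureExtractor pipeline for the same features. The synthetic test tone, the 16 kHz sample rate, and all variable names are assumptions made for this example only; every function runs with its default window and overlap settings.

```matlab
% Synthetic 1-second test signal (an assumption for this sketch, not real audio):
% a 220 Hz tone with a small amount of noise.
fs = 16000;
t = (0:fs-1)'/fs;
audioIn = sin(2*pi*220*t) + 0.01*randn(fs,1);

% Approach 1: call individual feature functions.
% Each call frames and transforms the signal on its own.
S        = melSpectrogram(audioIn,fs);    % mel spectrogram
coeffs   = mfcc(audioIn,fs);              % cepstral coefficients, one row per frame
f0       = pitch(audioIn,fs);             % fundamental frequency estimates in Hz
centroid = spectralCentroid(audioIn,fs);  % spectral centroid per frame

% Approach 2: one audioFeatureExtractor object for several features.
% Intermediate computations are shared across the enabled features.
aFE = audioFeatureExtractor("SampleRate",fs, ...
    "mfcc",true, ...
    "pitch",true, ...
    "spectralCentroid",true);

features = extract(aFE,audioIn);  % one row per analysis frame, one column per feature value
idx = info(aFE);                  % struct mapping each feature name to its column indices

pitchTrack = features(:,idx.pitch);  % pull a single feature out of the matrix
```

Because extract returns all features in one matrix, the struct returned by info is the practical way to locate a given feature's columns. The individual functions and the extractor use different default frame settings, so their outputs are not expected to have the same number of rows unless you set the window and overlap explicitly.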

Objects

`audioFeatureExtractor`	Streamline audio feature extraction
`ivectorSystem`	Create i-vector system (Since R2021a)

Live Editor Tasks

Extract Audio Features

Streamline audio feature extraction in the Live Editor

Functions

Auditory Spectrogram

`audioDelta`	Compute delta features
`designAuditoryFilterBank`	Design auditory filter bank
`melSpectrogram`	Mel spectrogram

Auditory Cepstral Coefficients

`audioDelta`	Compute delta features
`cepstralCoefficients`	Extract cepstral coefficients
`gtcc`	Extract gammatone cepstral coefficients, log-energy, delta, and delta-delta
`mfcc`	Extract MFCC, log-energy, delta, and delta-delta of audio signal

Feature Embeddings

`openl3Embeddings`	Extract OpenL3 feature embeddings (Since R2022a)
`vggishEmbeddings`	Extract VGGish feature embeddings (Since R2022a)
`speakerEmbeddings`	Extract speaker embeddings from speech (Since R2024b)

Periodicity and Harmonicity

`audioDelta`	Compute delta features
`harmonicRatio`	Harmonic ratio
`pitch`	Estimate fundamental frequency of audio signal
`pitchnn`	Estimate pitch with deep learning neural network (Since R2021a)

Spectral Descriptors

`audioDelta`	Compute delta features
`spectralCentroid`	Spectral centroid for audio signals and auditory spectrograms
`spectralCrest`	Spectral crest for signals and spectrograms
`spectralDecrease`	Spectral decrease for audio signals and auditory spectrograms
`spectralEntropy`	Spectral entropy for signals and spectrograms
`spectralFlatness`	Spectral flatness for signals and spectrograms
`spectralFlux`	Spectral flux for audio signals and auditory spectrograms
`spectralKurtosis`	Spectral kurtosis for signals and spectrograms
`spectralRolloffPoint`	Spectral rolloff point for audio signals and auditory spectrograms
`spectralSkewness`	Spectral skewness for signals and spectrograms
`spectralSlope`	Spectral slope for audio signals and auditory spectrograms
`spectralSpread`	Spectral spread for audio signals and auditory spectrograms

Domain Conversion

`erb2hz`	Convert from equivalent rectangular bandwidth (ERB) scale to hertz
`bark2hz`	Convert from Bark scale to hertz
`mel2hz`	Convert from mel scale to hertz
`hz2erb`	Convert from hertz to equivalent rectangular bandwidth (ERB) scale
`hz2bark`	Convert from hertz to Bark scale
`hz2mel`	Convert from hertz to mel scale
`phon2sone`	Convert from phon to sone
`sone2phon`	Convert from sone to phon

Blocks

Audio Delta	Compute delta features (Since R2022b)
Auditory Spectrogram	Extract mel, Bark, or ERB spectrogram from audio (Since R2022a)
Cepstral Coefficients	Extract cepstral coefficients from spectrogram (Since R2022b)
Design Auditory Filter Bank	Design frequency-domain auditory filter bank (Since R2022a)
Design Mel Filter Bank	Design frequency-domain mel filter bank (Since R2022a)
Mel Spectrogram	Extract mel spectrogram from audio (Since R2022a)
MFCC	Extract mel-frequency cepstral coefficients from audio (Since R2022b)

Topics

Feature Selection for Audio Classification
Perform audio feature selection to select a feature set for either speaker recognition or word recognition tasks.
Extract Features from Audio Data Sets
Use different methods of extracting features from an audio data set.
Spectral Descriptors
Overview and applications of spectral descriptors.
Learn Pre-Emphasis Filter Using Deep Learning
Use a convolutional deep network to learn a pre-emphasis filter for speech recognition. (Since R2021b)

Featured Examples

Train Spoken Digit Recognition Network Using Out-of-Memory Features

Train a spoken digit recognition network on out-of-memory auditory spectrograms using a transformed datastore. In this example, you extract auditory spectrograms from audio using audioDatastore and audioFeatureExtractor, and you write them to disk. You then use a signalDatastore to access the features during training. This workflow is useful when the training features do not fit in memory. Because you extract the features only once, it also saves time when you iterate on the deep learning model design.

Sequential Feature Selection for Audio Features

Follow a typical workflow for feature selection applied to the task of spoken digit recognition.

Pitch Tracking Using Multiple Pitch Estimations and HMM

Perform pitch tracking using multiple pitch estimations, octave and median smoothing, and a hidden Markov model (HMM).
