shiftPitch
Shift audio pitch
Description
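audioOut = shiftPitch(audioIn,nsemitones) shifts the pitch of the audio input by the specified number of semitones, nsemitones.
audioOut = shiftPitch(audioIn,nsemitones,Name,Value) specifies options using one or more Name,Value pair arguments.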
Examples
Read in an audio file and listen to it.
[audioIn,fs] = audioread('Counting-16-44p1-mono-15secs.wav');
sound(audioIn,fs)
Increase the pitch by 3 semitones and listen to the result.
nsemitones = 3;
audioOut = shiftPitch(audioIn,nsemitones);
sound(audioOut,fs)
Decrease the pitch of the original audio by 3 semitones and listen to the result.
nsemitones = -3;
audioOut = shiftPitch(audioIn,nsemitones);
sound(audioOut,fs)
Read in an audio file and listen to it.
[audioIn,fs] = audioread("SpeechDFT-16-8-mono-5secs.wav");
sound(audioIn,fs)
Convert the audio signal to a time-frequency representation using stft. Use a 512-point kbdwin with 75% overlap.
win = kbdwin(512);
overlapLength = 0.75*numel(win);
S = stft(audioIn, ...
    "Window",win, ...
    "OverlapLength",overlapLength, ...
    "Centered",false);
Increase the pitch by 8 semitones and listen to the result. Specify the window and overlap length you used to compute the STFT.
nsemitones = 8;
lockPhase = false;
audioOut = shiftPitch(S,nsemitones, ...
    "Window",win, ...
    "OverlapLength",overlapLength, ...
    "LockPhase",lockPhase);
sound(audioOut,fs)
Decrease the pitch of the original audio by 8 semitones and listen to the result. Specify the window and overlap length you used to compute the STFT.
nsemitones = -8;
lockPhase = false;
audioOut = shiftPitch(S,nsemitones, ...
    "Window",win, ...
    "OverlapLength",overlapLength, ...
    "LockPhase",lockPhase);
sound(audioOut,fs)
Read in an audio file and listen to it.
[audioIn,fs] = audioread('FemaleSpeech-16-8-mono-3secs.wav');
sound(audioIn,fs)
Increase the pitch by 6 semitones and listen to the result.
nsemitones = 6;
lockPhase = false;
audioOut = shiftPitch(audioIn,nsemitones, ...
    'LockPhase',lockPhase);
sound(audioOut,fs)
To increase fidelity, set LockPhase to true. Apply pitch shifting, and listen to the results.
lockPhase = true;
audioOut = shiftPitch(audioIn,nsemitones, ...
    'LockPhase',lockPhase);
sound(audioOut,fs)
Read in the first 11.5 seconds of an audio file and listen to it.
[audioIn,fs] = audioread('Rainbow-16-8-mono-114secs.wav',[1,8e3*11.5]);
sound(audioIn,fs)
Increase the pitch by 4 semitones and apply phase locking. Listen to the results. The resulting audio has a "chipmunk effect" that sounds unnatural.
nsemitones = 4;
lockPhase = true;
audioOut = shiftPitch(audioIn,nsemitones, ...
    "LockPhase",lockPhase);
sound(audioOut,fs)
To increase fidelity, set PreserveFormants to true. Use the default cepstral order of 30. Listen to the result.
cepstralOrder = 30;
audioOut = shiftPitch(audioIn,nsemitones, ...
    "LockPhase",lockPhase, ...
    "PreserveFormants",true, ...
    "CepstralOrder",cepstralOrder);
sound(audioOut,fs)
Input Arguments
Input signal, specified as a column vector, matrix, or 3-D array. How the function interprets audioIn depends on the complexity of audioIn:
- If audioIn is real, audioIn is interpreted as a time-domain signal. In this case, audioIn must be a column vector or matrix. Columns are interpreted as individual channels.
- If audioIn is complex, audioIn is interpreted as a frequency-domain signal. In this case, audioIn must be an L-by-M-by-N array, where L is the FFT length, M is the number of individual spectra, and N is the number of channels.
Data Types: single | double
Complex Number Support: Yes
Number of semitones to shift the audio by, specified as a real scalar.
The range of nsemitones depends on the window length (numel(Window)) and the overlap length (OverlapLength):
-12*log2(numel(Window)-OverlapLength) ≤ nsemitones ≤ -12*log2((numel(Window)-OverlapLength)/numel(Window))
Data Types: single | double
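As a quick check before calling shiftPitch, the bounds above can be evaluated directly. The following is a minimal sketch that uses only base MATLAB; the 512-sample placeholder window, the 75% overlap, and the variable names minShift, maxShift, and isValid are assumptions chosen for illustration, not part of the function's interface.

```matlab
% Placeholder window: only its length matters for the bounds.
% In practice this would be the window passed as 'Window', e.g. kbdwin(512).
win = ones(512,1);
overlapLength = 384;                       % 75% of the window length

hopLength = numel(win) - overlapLength;    % 128 samples

% Bounds on nsemitones, as given in the argument description.
minShift = -12*log2(hopLength);                 % -84 (largest downward shift)
maxShift = -12*log2(hopLength/numel(win));      %  24 (largest upward shift)

% Check a requested shift against the bounds.
nsemitones = 8;
isValid = (nsemitones >= minShift) && (nsemitones <= maxShift);
```

With these settings the allowed range spans -84 to 24 semitones, so a request of 8 semitones is accepted. Increasing the overlap shrinks the hop length and therefore widens the upward range.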
Name-Value Arguments
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Before R2021a, use commas to separate each name and value, and enclose Name in quotes.
Example: 'Window',kbdwin(512)
Window applied in the time domain, specified as the comma-separated pair consisting of 'Window' and a real vector. The number of elements in the vector must be in the range [1, size(audioIn,1)]. The number of elements in the vector must also be greater than OverlapLength.
Note
If using shiftPitch with frequency-domain input, you must specify Window as the same window used to transform audioIn to the frequency domain.
Data Types: single | double
Number of samples overlapped between adjacent windows, specified as the comma-separated pair consisting of 'OverlapLength' and an integer in the range [0, numel(Window)).
Note
If using shiftPitch with frequency-domain input, you must specify OverlapLength as the same overlap length used to transform audioIn to a time-frequency representation.
Data Types: single | double
Apply identity phase locking, specified as the comma-separated pair consisting of 'LockPhase' and false or true.
Data Types: logical
Preserve formants, specified as the comma-separated pair consisting of 'PreserveFormants' and true or false. Formant preservation is attempted using spectral envelope estimation with cepstral analysis.
Data Types: logical
Cepstral order used for formant preservation, specified as the comma-separated pair consisting of 'CepstralOrder' and a nonnegative integer.
Dependencies
To enable this name-value pair argument, set PreserveFormants to true.
Data Types: single | double
Output Arguments
Pitch-shifted audio, returned as a column vector or matrix of independent channels.
Algorithms
To apply pitch shifting, shiftPitch modifies the time scale of audio using a phase vocoder and then resamples the modified audio. The time-scale modification algorithm is based on [1] and [2] and is implemented as in stretchAudio.
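To see why a time-scale modification stage is paired with resampling, it can help to look at resampling on its own. The sketch below is an illustration in base MATLAB, not the toolbox implementation: it reads a test tone out faster using linear interpolation (interp1), which raises the pitch but also shortens the signal. The sample rate, the 440 Hz tone, and all variable names are assumptions.

```matlab
fs = 8000;                          % sample rate in Hz (assumed)
f0 = 440;                           % test-tone frequency in Hz
n  = (0:fs-1)';                     % one second of sample indices
x  = sin(2*pi*f0*n/fs);

nsemitones = 12;                    % one octave up
ratio = 2^(nsemitones/12);          % frequency ratio (2 for an octave)

% Read the signal out "ratio" times faster using linear interpolation.
readPositions = (0:ratio:numel(x)-1)';
y = interp1(n, x, readPositions);

% Locate the spectral peak of the resampled tone.
Y = abs(fft(y));
[~,peakBin] = max(Y(1:floor(numel(y)/2)));
peakFrequency = (peakBin-1)*fs/numel(y);

% The tone is now an octave higher, but the signal is half as long.
durationIn  = numel(x)/fs;
durationOut = numel(y)/fs;
```

Here the peak moves from 440 Hz to 880 Hz while the duration drops from 1 s to 0.5 s. Stretching the audio by the inverse factor first, as the phase vocoder stage does, compensates for that change in duration.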
After time-scale modification, shiftPitch performs sample rate conversion using an interpolation factor equal to the analysis hop length and a decimation factor equal to the synthesis hop length. The interpolation and decimation factors of the resampling stage are selected as follows: The analysis hop length is determined as analysisHopLength = numel(Window)-OverlapLength. The shiftPitch function assumes that there are 12 semitones in an octave, so the speedup factor used to stretch the audio is speedupFactor = 2^(-nsemitones/12). The speedup factor and analysis hop length determine the synthesis hop length for time-scale modification as synthesisHopLength = round((1/speedupFactor)*analysisHopLength).
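The hop-length arithmetic described above can be traced with a few lines of base MATLAB. This is a sketch of the stated formulas only, with an assumed 512-sample window at 75% overlap; the names interpolationFactor and decimationFactor are illustrative labels for the quantities the text describes.

```matlab
win = ones(512,1);            % placeholder window; only its length is used
overlapLength = 384;          % 75% overlap
nsemitones = 3;               % requested shift

analysisHopLength  = numel(win) - overlapLength;                  % 128
speedupFactor      = 2^(-nsemitones/12);                          % about 0.84
synthesisHopLength = round((1/speedupFactor)*analysisHopLength);  % 152

% Factors used by the sample rate conversion stage, per the text above.
interpolationFactor = analysisHopLength;
decimationFactor    = synthesisHopLength;
```

For a positive shift the speedup factor is below 1, so the audio is first stretched (synthesis hop longer than analysis hop) and then resampled by 128/152, which shortens it again and raises the pitch.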
The achievable pitch shift is determined by the window length (numel(Window)) and OverlapLength. To see the relationship, note that the equation for speedup factor can be rewritten as nsemitones = -12*log2(speedupFactor), and that speedupFactor = analysisHopLength/synthesisHopLength. Using simple substitution, nsemitones = -12*log2(analysisHopLength/synthesisHopLength). The practical range of a synthesis hop length is [1, numel(Window)]. The range of achievable pitch shifts is:
- Max number of semitones lowered: -12*log2(numel(Window)-OverlapLength)
- Max number of semitones raised: -12*log2((numel(Window)-OverlapLength)/numel(Window))
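Because the synthesis hop length is rounded to an integer, the shift that results from these formulas can differ slightly from the requested one. The sketch below applies the relationships above to a few requested shifts. It is an illustration under assumed settings (512-sample window, 75% overlap), and effectiveShift is a hypothetical name for the shift implied by the rounded hop length.

```matlab
winLength = 512;
overlapLength = 384;
analysisHopLength = winLength - overlapLength;      % 128

requestedShift = [-8 -3 3 8];                       % semitones

% synthesisHopLength = round((1/speedupFactor)*analysisHopLength),
% with 1/speedupFactor = 2^(nsemitones/12).
synthesisHopLength = round(analysisHopLength .* 2.^(requestedShift/12));

% Shift implied by the rounded hop lengths:
% nsemitones = -12*log2(analysisHopLength/synthesisHopLength).
effectiveShift = -12*log2(analysisHopLength ./ synthesisHopLength);

shiftError = effectiveShift - requestedShift;       % rounding error
```

With a 128-sample analysis hop, the rounded synthesis hops are 81, 108, 152, and 203 samples, and each implied shift stays within a tenth of a semitone of the request. A smaller analysis hop gives coarser steps, because fewer integer synthesis hop lengths fall between the extremes.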
Pitch shifting can alter the spectral envelope of the pitch-shifted signal. To diminish this effect, you can set PreserveFormants to true. If PreserveFormants is set to true, the algorithm attempts to estimate the spectral envelope using an iterative procedure in the cepstral domain, as described in [3] and [4]. For both the original spectrum, X, and the pitch-shifted spectrum, Y, the algorithm estimates the spectral envelope as follows.
For the first iteration, EnvXa is set to X. Then, the algorithm repeats these two steps in a loop:
- Lowpass filter the cepstral representation of EnvXa to get a new estimate, EnvXb. The CepstralOrder parameter controls the quefrency bandwidth.
- Update the current best fit by taking the element-by-element maximum of the current spectral envelope estimate and the previous spectral envelope estimate: EnvXa = max(EnvXa, EnvXb).

The loop ends if either a maximum number of iterations (100) is reached, or if all bins of the estimated log envelope are within a given tolerance of the original log spectrum. The tolerance is set to log(10^(1/20)).
Finally, the algorithm scales the spectrum of the pitch-shifted audio by the ratio of estimated envelopes, element-wise: Y = Y .* (EnvXb ./ EnvYb), where EnvXb and EnvYb are the final envelope estimates for X and Y.
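The envelope estimation loop and the final scaling step can be sketched in base MATLAB as follows. This is an illustration under stated assumptions, not the toolbox's internal code: the spectra are random stand-ins, the lowpass operation in the cepstral domain is implemented as a rectangular lifter that keeps quefrencies up to the cepstral order, the stopping test is read as "no bin of the log spectrum exceeds the envelope by more than the tolerance", and every variable name is hypothetical.

```matlab
rng(0);                                   % reproducible stand-in data
N = 512;                                  % FFT length (assumed)
cepstralOrder = 30;
tol = log(10^(1/20));                     % tolerance from the text
maxIter = 100;                            % iteration cap from the text

X = abs(fft(randn(N,1)));                 % stand-in original magnitude spectrum
Y = abs(fft(randn(N,1)));                 % stand-in pitch-shifted magnitude spectrum
logSpectra = log([X, Y] + eps);           % work on log magnitudes

% Rectangular lifter (assumption): keep low quefrencies and their mirror.
lifter = zeros(N,1);
lifter([1:cepstralOrder+1, N-cepstralOrder+1:N]) = 1;

logEnv = zeros(N,2);
for k = 1:2
    logS = logSpectra(:,k);
    envA = logS;                          % first iteration starts at the spectrum
    for iter = 1:maxIter
        cep  = real(ifft(envA));          % cepstral representation
        envB = real(fft(cep .* lifter));  % lowpass filtered, new estimate
        if all(logS - envB <= tol)        % assumed reading of the stop test
            break
        end
        envA = max(envA, envB);           % element-by-element maximum
    end
    logEnv(:,k) = envB;
end

% Scale the pitch-shifted spectrum by the ratio of the two envelopes.
Ycorrected = Y .* exp(logEnv(:,1) - logEnv(:,2));
```

Working with log magnitudes turns the ratio of envelopes into a difference, which is why the last line exponentiates logEnv(:,1) - logEnv(:,2). A lower cepstral order keeps fewer quefrency bins and so produces a smoother envelope.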
References
[1] Driedger, Jonathan, and Meinard Müller. "A Review of Time-Scale Modification of Music Signals." Applied Sciences. Vol. 6, Issue 2, 2016.
[2] Driedger, Jonathan. "Time-Scale Modification Algorithms for Music Audio Signals." Master's Thesis. Saarland University, Saarbrücken, Germany, 2011.
[3] Roebel, Axel, and Xavier Rodet. "Efficient Spectral Envelope Estimation and Its Application to Pitch Shifting and Envelope Preservation." International Conference on Digital Audio Effects, pp. 30–35. Madrid, Spain, September 2005. hal-01161334
[4] Imai, S., and Y. Abe. "Spectral Envelope Extraction by Improved Cepstral Method." Electron. and Commun. in Japan. Vol. 62-A, Issue 4, 1979, pp. 10–17.
Extended Capabilities
C/C++ Code Generation
 Generate C and C++ code using MATLAB® Coder™.
Usage notes and limitations:
- LockPhase must be set to false.
GPU Arrays
Usage notes and limitations:
- Using gpuArray (Parallel Computing Toolbox) input with shiftPitch is only recommended for a GPU with compute capability 7.0 ("Volta") or above. Other hardware might not offer any performance advantage. To check your GPU compute capability, see ComputeCapability in the output from the gpuDevice (Parallel Computing Toolbox) function. For more information, see GPU Computing Requirements (Parallel Computing Toolbox).
For an overview of GPU usage in MATLAB®, see Run MATLAB Functions on a GPU (Parallel Computing Toolbox).
Version History
Introduced in R2019b
See Also
stretchAudio | reverberator | audioTimeScaler | audioDataAugmenter