Ambisonic Binaural Decoding

This example shows how to decode ambisonic audio into binaural audio using virtual loudspeakers. A virtual loudspeaker is a sound source positioned on the surface of a sphere, with the listener located at the center of the sphere. Each virtual loudspeaker has a pair of head-related transfer functions (HRTFs) associated with it: one for the left ear and one for the right ear. The virtual loudspeaker locations and the ambisonic order are used to calculate the ambisonic decoder matrix. Each output channel of the decoder is filtered by the HRTF pair corresponding to its virtual loudspeaker position. The signals from the left HRTFs are summed together and fed to the left ear. The signals from the right HRTFs are summed together and fed to the right ear. A block diagram of the audio signal flow is shown here.
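
The signal flow described above can be sketched with base MATLAB functions and placeholder data. This is only an illustration: the sizes, the random decoder matrix, the random impulse responses, and the speakers-by-channels orientation of the decoder are all assumptions, not values taken from this example.

```matlab
% Sketch of the virtual-loudspeaker signal flow with placeholder data.
% All sizes and the random decoder/HRTF values are illustrative assumptions.
rng(0);
numSamples  = 256;   % length of the ambisonic signal
numChannels = 4;     % first-order ambisonics has (1+1)^2 = 4 channels
numSpeakers = 6;     % number of virtual loudspeakers
hrirLength  = 32;    % length of each head-related impulse response

ambi    = randn(numSamples,numChannels);   % ambisonic input
decoder = randn(numSpeakers,numChannels);  % stand-in decoder matrix
hrirL   = randn(numSpeakers,hrirLength);   % stand-in left-ear responses
hrirR   = randn(numSpeakers,hrirLength);   % stand-in right-ear responses

% Decode the ambisonic channels to virtual loudspeaker feeds
speakerFeeds = ambi*decoder.';             % numSamples-by-numSpeakers

% Filter each feed with its HRTF pair and sum the results per ear
binaural = zeros(numSamples+hrirLength-1,2);
for k = 1:numSpeakers
    binaural(:,1) = binaural(:,1) + conv(speakerFeeds(:,k),hrirL(k,:).');
    binaural(:,2) = binaural(:,2) + conv(speakerFeeds(:,k),hrirR(k,:).');
end
```

The first column of binaural is the left-ear signal and the second column is the right-ear signal.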

Load the ARI HRTF Dataset

ARIDataset = sofaread("ReferenceHRTF.sofa");

Get the HRTF data. The array has the required dimensions: NumOfSourceMeasurements-by-2-by-LengthOfSamples.

hrtfData = ARIDataset.Numerator;
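
To illustrate this layout, the following sketch indexes a placeholder array of the same shape. The array sizes and the measurement index are assumptions, and the sketch assumes that the first entry along the second dimension is the left ear, which is the usual SOFA receiver order.

```matlab
% Placeholder array with the same layout as hrtfData (sizes are assumptions)
numMeasurements = 5;
numTaps = 8;
hrtfDemo = randn(numMeasurements,2,numTaps);

idx = 3;                                % arbitrary measurement index
leftIR  = squeeze(hrtfDemo(idx,1,:));   % impulse response, assumed left ear
rightIR = squeeze(hrtfDemo(idx,2,:));   % impulse response, assumed right ear
```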

Confirm that the source position is expressed in spherical coordinates.

ARIDataset.SourcePositionType

ans = 
'spherical'

Get the source azimuth and elevation angle values.

sourcePosition = ARIDataset.SourcePosition(:,[1,2]);

The ARI HRTF database used in this example is based on work by the Acoustics Research Institute. The HRTF data and source positions in ReferenceHRTF.sofa are from ARI subject NH2.

The HRTF databases by the Acoustics Research Institute, Austrian Academy of Sciences, are licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License: https://creativecommons.org/licenses/by-sa/3.0/.

Select Points from ARI HRTF Dataset

Now that the HRTF Dataset is loaded, determine which points to pick for virtual loudspeakers. This example picks random points distributed on the surface of a sphere and selects the points of the HRTF dataset closest to the picked points.

Pick random points from a uniform distribution on the sphere.
Compute the distance from each picked point to every point in the HRTF dataset.
For each picked point, select the dataset point at the shortest distance.
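
As a quick check of the first step, the sketch below draws many elevation angles with the same acos mapping that this example uses and confirms two properties of a uniform distribution on a sphere: the sine of the elevation is uniformly distributed, and about one quarter of the points lie above 30 degrees elevation. The sample count and the 30 degree threshold are arbitrary choices for illustration.

```matlab
rng(0);
n = 1e5;                              % number of test points (assumption)
u = rand(1,n);
el = rad2deg(acos(2*u-1)) - 90;       % elevation in degrees, -90 to 90

% acos(x) - 90 degrees equals -asin(x), so sin(el) equals 1-2u
maxErr = max(abs(sind(el) - (1-2*u)));

% The spherical cap above 30 degrees covers (1 - sind(30))/2 = 0.25 of the area
fracAbove30 = mean(el > 30);
```

Drawing the elevation directly from a uniform distribution over -90 to 90 degrees would instead cluster the points near the poles.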

% Create a sphere with a distribution of points
nPoints = 24;   % number of points to pick
rng(0);         % seed random number generator
sphereAZ = 360*rand(1,nPoints);
sphereEL = rad2deg(acos(2*rand(1,nPoints)-1))-90;
pickedSphere = [sphereAZ' sphereEL'];

% Compare distributed points on the sphere to points from the HRTF dataset
pick = zeros(1, nPoints);
d = zeros(size(pickedSphere,1), size(sourcePosition,1));
for ii = 1:size(pickedSphere,1)
    for jj = 1:size(sourcePosition,1)
        % Calculate arc length
        d(ii,jj) = acos( ...
            sind(pickedSphere(ii,2))*sind(sourcePosition(jj,2)) + ...
            cosd(pickedSphere(ii,2))*cosd(sourcePosition(jj,2)) * ... 
            cosd(pickedSphere(ii,1) - sourcePosition(jj,1)));
    end
    [~,Idx] = sort(d(ii,:)); % Sort points
    pick(ii) = Idx(1);       % Pick the closest point
end
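
The same nearest-point search can also be written without loops. The following is a minimal sketch that uses implicit expansion and min on small placeholder direction lists; the direction values are assumptions chosen for illustration, and the clamp before acos is an added guard against rounding error.

```matlab
% Placeholder directions in degrees, [azimuth elevation] (values are assumptions)
picked  = [0 0; 90 10; 200 -45];
sources = [5 0; 85 15; 180 -40; 270 80; 0 -90];

azP = picked(:,1);     elP = picked(:,2);      % column vectors
azS = sources(:,1).';  elS = sources(:,2).';   % row vectors

% Implicit expansion builds the picked-by-source matrix of cosines
cosAngle = sind(elP).*sind(elS) + cosd(elP).*cosd(elS).*cosd(azP - azS);
cosAngle = min(max(cosAngle,-1),1);   % guard against rounding outside [-1,1]
arcLen = acos(cosAngle);              % central angle in radians

[~,nearest] = min(arcLen,[],2);       % closest source for each picked point
```

With either approach, two picked points can map to the same dataset point, so the number of distinct virtual loudspeakers can be smaller than the number of picked points.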

Create Ambisonic Decoder

Specify the desired ambisonic order and the desired virtual loudspeaker source positions as inputs to the audioexample.ambisonics.ambidecodemtrx helper function. The function returns an ambisonic decoder matrix.

order = 7;
devices = sourcePosition(pick,:)';
dmtrx = audioexample.ambisonics.ambidecodemtrx(order, devices);
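
To show how a decoder matrix relates to loudspeaker positions, the sketch below builds a first-order decoder with the mode-matching (pseudo-inverse) method for eight loudspeakers at the corners of a cube. This is one common design approach and is not necessarily the method that the helper function uses. The cube layout, the first-order restriction, and the speakers-by-channels orientation are assumptions made for illustration.

```matlab
% Virtual loudspeakers at the corners of a cube, [azimuth; elevation] in degrees
elCube = asind(1/sqrt(3));
spk = [45 135 225 315 45 135 225 315; ...
       elCube elCube elCube elCube -elCube -elCube -elCube -elCube];
az = spk(1,:);
el = spk(2,:);

% First-order encoding matrix in ACN order (W, Y, Z, X) with SN3D scaling
Y = [ones(size(az)); ...
     sind(az).*cosd(el); ...
     sind(el); ...
     cosd(az).*cosd(el)];     % 4-by-8, channels-by-speakers

% Mode-matching decoder: pseudo-inverse of the encoding matrix
D = pinv(Y);                  % 8-by-4, speakers-by-channels

% Encode a source at the first loudspeaker position and decode it
src = Y(:,1);
gains = D*src;                % one gain per loudspeaker
[~,loudest] = max(gains);     % index of the loudspeaker with the largest gain
```

For this layout, the largest gain falls on the loudspeaker that coincides with the source direction, and encoding the decoded feeds again reproduces the original ambisonic channels.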

Create HRTF Filters

Create an FIR filter to perform binaural HRTF filtering based on the position of the virtual loudspeakers.

filters = squeeze(hrtfData(pick,:,:));
filters = permute(filters,[2 1 3]);
filters = reshape(filters,size(filters,1)*size(filters,2),[]);
filt = dsp.FrequencyDomainFIRFilter(filters, SumFilteredOutputs=true);
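
The permute and reshape steps interleave the left and right responses of each loudspeaker into consecutive rows. The following sketch makes the resulting row order visible on a small tagged array. The sizes and the tagging scheme are assumptions for illustration, and squeeze is omitted because it has no effect on an array of this shape.

```matlab
numSpeakers = 3;
numTaps = 4;
demo = zeros(numSpeakers,2,numTaps);
for k = 1:numSpeakers
    for e = 1:2
        demo(k,e,:) = 10*k + e;   % tag: tens digit = speaker, ones digit = ear
    end
end

rows = permute(demo,[2 1 3]);                         % 2-by-3-by-4
rows = reshape(rows,size(rows,1)*size(rows,2),[]);    % 6-by-4
rowTags = rows(:,1).';   % [11 12 21 22 31 32]
```

Each pair of rows holds the two ear responses for one loudspeaker, so a 24-loudspeaker selection produces 48 filter rows.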

Create Audio Input and Output Objects

Load the ambisonic audio file of helicopter sound, resample it to 48 kHz for compatibility with the HRTF dataset, and write the result to a new audio file.

desiredFs = 48e3;
[audio,fs] = audioread("Heli_16ch_ACN_SN3D.wav");
audio = resample(audio,desiredFs,fs);
audiowrite("Heli_16ch_ACN_SN3D_48.wav",audio,desiredFs);
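
As a small illustration of the resampling step, the sketch below resamples a synthetic two-channel signal and checks that the duration is preserved. The 44.1 kHz input rate and the test tones are assumptions; the sample rate of the actual file is returned by audioread.

```matlab
fsIn  = 44100;                         % assumed input rate for illustration
fsOut = 48000;                         % target rate
t = (0:fsIn-1).'/fsIn;                 % one second of samples
x = [sin(2*pi*440*t) sin(2*pi*880*t)]; % two-channel test signal

y = resample(x,fsOut,fsIn);            % resamples each column independently

durationIn  = size(x,1)/fsIn;          % duration in seconds before resampling
durationOut = size(y,1)/fsOut;         % duration in seconds after resampling
```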

Specify the ambisonic format of the audio file. Set up the audio input and audio output objects.

format = "acn-sn3d";
samplesPerFrame = 2048;
fileReader = dsp.AudioFileReader("Heli_16ch_ACN_SN3D_48.wav", ...
                    SamplesPerFrame=samplesPerFrame);
deviceWriter = audioDeviceWriter(SampleRate=desiredFs);
audioFiltered = zeros(samplesPerFrame,size(filters,1),2);
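
The format string identifies the channel ordering (ACN) and the normalization (SN3D) of the file. As background, the sketch below derives the ambisonic order of each ACN channel and the per-channel gain that converts SN3D normalization to N3D. It illustrates the conventions only and is not a description of what the helper functions do internally; the placeholder frame is an assumption.

```matlab
numChannels = 16;               % third order: (3+1)^2 channels
acn = 0:numChannels-1;          % zero-based ACN channel index
n = floor(sqrt(acn));           % ambisonic order of each channel
sn3dToN3d = sqrt(2*n + 1);      % per-channel SN3D-to-N3D gain

frame = ones(4,numChannels);    % placeholder SN3D frame
frameN3D = frame.*sn3dToN3d;    % scale every channel by its gain
```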

Process Audio

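while ~isDone(fileReader)
    % Read one frame of ambisonic audio
    audioAmbi = fileReader();
    % Decode the frame to the virtual loudspeaker feeds
    audioDecoded = audioexample.ambisonics.ambidecode(audioAmbi, dmtrx, format);
    % Apply the HRTF filters and scale the result by a fixed gain of 10
    audioFiltered = 10*filt(audioDecoded);
    % Play the frame; the writer returns the number of underrun samples
    numUnderrun = deviceWriter(audioFiltered);
end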

% Release resources
release(fileReader)
release(deviceWriter)
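
Frame-based processing gives the same result as filtering the whole signal at once only if the filter carries its state from one frame to the next. The sketch below demonstrates this idea with the base MATLAB filter function in place of the System object used in this example; the frame length, signal, and coefficients are placeholder assumptions.

```matlab
rng(0);
frameLen  = 64;
numFrames = 10;
x = randn(frameLen*numFrames,1);   % placeholder input signal
b = randn(16,1);                   % placeholder FIR coefficients

% Reference: filter the whole signal in one call
yWhole = filter(b,1,x);

% Frame-based: carry the filter state from one frame to the next
yFrames = zeros(size(x));
state = zeros(numel(b)-1,1);
for f = 1:numFrames
    idx = (f-1)*frameLen + (1:frameLen);
    [yFrames(idx),state] = filter(b,1,x(idx),state);
end

maxDiff = max(abs(yWhole - yFrames));   % difference between the two results
```

Resetting the state at every frame would instead introduce a discontinuity at each frame boundary.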
