splitEachLabel

Splits datastore according to specified label proportions

Description

[ADS1,ADS2] = splitEachLabel(ADS,p) splits the audio files in ADS into two new datastores, ADS1 and ADS2. The new datastore ADS1 contains the first p files from each label, and ADS2 contains the remaining files from each label. p can be either a number between 0 and 1, exclusive, indicating the percentage of the files from each label to assign to ADS1, or an integer indicating the absolute number of files from each label to assign to ADS1.

[ADS1,...,ADSM] = splitEachLabel(ADS,p1,...,pN) splits the datastore into N+1 new datastores. The new datastore ADS1 contains the first p1 files from each label, the next new datastore ADS2 contains the next p2 files, and so on. If p1,…,pN represent numbers of files, then their sum must be no more than the number of files in the smallest label in the original datastore, ADS.

___ = splitEachLabel(___,'randomized') randomly assigns the specified proportion of files from each label to the new datastores.

___ = splitEachLabel(___,Name,Value) specifies the properties of the new datastores using one or more name-value pair arguments. For example, you can specify which labels to split with 'Include','labelname'.

Examples

Specify the file path to the audio samples included with Audio Toolbox™. Create an audio datastore that points to the specified folder.

folder = fullfile(matlabroot,'toolbox','audio','samples');
ADS = audioDatastore(folder,'FileExtensions','.wav');

Add the label A to the first half of the files, and the label B to the second half. If there is an odd number of files, assign the extra file the label B. Call countEachLabel to confirm that half of the files are labeled A and half the files are labeled B.

labels = [repmat({'A'},1,floor(numel(ADS.Files)/2)), ...
          repmat({'B'},1,ceil(numel(ADS.Files)/2))];
ADS.Labels = labels;

countEachLabel(ADS)
ans=2×2 table
    Label    Count
    _____    _____

      A       10  
      B       10  

Split ADS into two datastores, ADS1 and ADS2, specifying that each new datastore contains fifty percent of each label and the corresponding files. Call countEachLabel to confirm that half of the files are labeled A and half of the files are labeled B for each of the new datastores.

[ADS1,ADS2] = splitEachLabel(ADS,0.5)
ADS1 = 
  audioDatastore with properties:

                       Files: {
                              ' .../matlab/toolbox/audio/samples/Ambiance-16-44p1-mono-12secs.wav';
                              ' .../matlab/toolbox/audio/samples/AudioArray-16-16-4channels-20secs.wav';
                              ' .../toolbox/audio/samples/ChurchImpulseResponse-16-44p1-mono-5secs.wav'
                               ... and 7 more
                              }
                     Folders: {
                              ' .../Bdoc24b.2679053/build/runnable/matlab/toolbox/audio/samples'
                              }
                      Labels: {'A'; 'A'; 'A' ... and 7 more}
    AlternateFileSystemRoots: {}
              OutputDataType: 'double'
           OutputEnvironment: 'cpu'
      SupportedOutputFormats: ["wav"    "flac"    "ogg"    "opus"    "mp3"    "mp4"    "m4a"]
         DefaultOutputFormat: "wav"

ADS2 = 
  audioDatastore with properties:

                       Files: {
                              ' .../runnable/matlab/toolbox/audio/samples/Engine-16-44p1-stereo-20sec.wav';
                              ' .../matlab/toolbox/audio/samples/FemaleSpeech-16-8-mono-3secs.wav';
                              ' .../build/runnable/matlab/toolbox/audio/samples/Heli_16ch_ACN_SN3D.wav'
                               ... and 7 more
                              }
                     Folders: {
                              ' .../Bdoc24b.2679053/build/runnable/matlab/toolbox/audio/samples'
                              }
                      Labels: {'A'; 'A'; 'A' ... and 7 more}
    AlternateFileSystemRoots: {}
              OutputDataType: 'double'
           OutputEnvironment: 'cpu'
      SupportedOutputFormats: ["wav"    "flac"    "ogg"    "opus"    "mp3"    "mp4"    "m4a"]
         DefaultOutputFormat: "wav"

ADS1count = countEachLabel(ADS1)
ADS1count=2×2 table
    Label    Count
    _____    _____

      A        5  
      B        5  

ADS2count = countEachLabel(ADS2)
ADS2count=2×2 table
    Label    Count
    _____    _____

      A        5  
      B        5  

Specify the file path to the audio samples included with Audio Toolbox™. Create an audio datastore that points to the specified folder.

folder = fullfile(matlabroot,'toolbox','audio','samples');
ADS = audioDatastore(folder,'FileExtensions','.wav');

Add the label A to the first half of the files, and the label B to the second half. If there is an odd number of files, assign the extra file the label B. Call countEachLabel to confirm that half of the files are labeled A and half the files are labeled B.

labels = [repmat({'A'},1,floor(numel(ADS.Files)/2)), ...
          repmat({'B'},1,ceil(numel(ADS.Files)/2))];
ADS.Labels = labels;

countEachLabel(ADS)
ans=2×2 table
    Label    Count
    _____    _____

      A       10  
      B       10  

Split ADS into two datastores, ADS1 and ADS2. Specify that ADS1 contains four files from each label. ADS2 contains the remaining files from each label. Call countEachLabel to confirm that ADS1 contains four files labeled A and four files labeled B, and that ADS2 contains the remaining files.

[ADS1,ADS2] = splitEachLabel(ADS,4)
ADS1 = 
  audioDatastore with properties:

                       Files: {
                              ' .../matlab/toolbox/audio/samples/Ambiance-16-44p1-mono-12secs.wav';
                              ' .../matlab/toolbox/audio/samples/AudioArray-16-16-4channels-20secs.wav';
                              ' .../toolbox/audio/samples/ChurchImpulseResponse-16-44p1-mono-5secs.wav'
                               ... and 5 more
                              }
                     Folders: {
                              ' .../Bdoc24b.2679053/build/runnable/matlab/toolbox/audio/samples'
                              }
                      Labels: {'A'; 'A'; 'A' ... and 5 more}
    AlternateFileSystemRoots: {}
              OutputDataType: 'double'
           OutputEnvironment: 'cpu'
      SupportedOutputFormats: ["wav"    "flac"    "ogg"    "opus"    "mp3"    "mp4"    "m4a"]
         DefaultOutputFormat: "wav"

ADS2 = 
  audioDatastore with properties:

                       Files: {
                              ' .../matlab/toolbox/audio/samples/Counting-16-44p1-mono-15secs.wav';
                              ' .../runnable/matlab/toolbox/audio/samples/Engine-16-44p1-stereo-20sec.wav';
                              ' .../matlab/toolbox/audio/samples/FemaleSpeech-16-8-mono-3secs.wav'
                               ... and 9 more
                              }
                     Folders: {
                              ' .../Bdoc24b.2679053/build/runnable/matlab/toolbox/audio/samples'
                              }
                      Labels: {'A'; 'A'; 'A' ... and 9 more}
    AlternateFileSystemRoots: {}
              OutputDataType: 'double'
           OutputEnvironment: 'cpu'
      SupportedOutputFormats: ["wav"    "flac"    "ogg"    "opus"    "mp3"    "mp4"    "m4a"]
         DefaultOutputFormat: "wav"

ADS1count = countEachLabel(ADS1)
ADS1count=2×2 table
    Label    Count
    _____    _____

      A        4  
      B        4  

ADS2count = countEachLabel(ADS2)
ADS2count=2×2 table
    Label    Count
    _____    _____

      A        6  
      B        6  

Specify the file path to the audio samples included with Audio Toolbox™. Create an audio datastore that points to the specified folder.

folder = fullfile(matlabroot,'toolbox','audio','samples');
ADS = audioDatastore(folder,'FileExtensions','.wav');

Add the label A to the first half of the files, and the label B to the second half. If there is an odd number of files, assign the extra file the label B. Call countEachLabel to confirm that half of the files are labeled A and half the files are labeled B.

labels = [repmat({'A'},1,floor(numel(ADS.Files)/2)), ...
          repmat({'B'},1,ceil(numel(ADS.Files)/2))];
ADS.Labels = labels;

countEachLabel(ADS)
ans=2×2 table
    Label    Count
    _____    _____

      A       10  
      B       10  

Split ADS into three new datastores, ADS60, ADS10, and ADS30. The first datastore, ADS60, contains the first 60% of files with the A label and the first 60% of files with the B label. ADS10 contains the next 10% of files from each label. ADS30 contains the remaining 30% of files from each label. If the percentage applied to a label does not result in a whole number of files, splitEachLabel rounds down to the nearest whole number.

[ADS60,ADS10,ADS30] = splitEachLabel(ADS,0.6,0.1)
ADS60 = 
  audioDatastore with properties:

                       Files: {
                              ' .../matlab/toolbox/audio/samples/Ambiance-16-44p1-mono-12secs.wav';
                              ' .../matlab/toolbox/audio/samples/AudioArray-16-16-4channels-20secs.wav';
                              ' .../toolbox/audio/samples/ChurchImpulseResponse-16-44p1-mono-5secs.wav'
                               ... and 9 more
                              }
                     Folders: {
                              ' .../Bdoc24b.2679053/build/runnable/matlab/toolbox/audio/samples'
                              }
                      Labels: {'A'; 'A'; 'A' ... and 9 more}
    AlternateFileSystemRoots: {}
              OutputDataType: 'double'
           OutputEnvironment: 'cpu'
      SupportedOutputFormats: ["wav"    "flac"    "ogg"    "opus"    "mp3"    "mp4"    "m4a"]
         DefaultOutputFormat: "wav"

ADS10 = 
  audioDatastore with properties:

                       Files: {
                              ' .../matlab/toolbox/audio/samples/FemaleSpeech-16-8-mono-3secs.wav';
                              ' .../matlab/toolbox/audio/samples/TrainWhistle-16-44p1-mono-9secs.wav'
                              }
                     Folders: {
                              ' .../Bdoc24b.2679053/build/runnable/matlab/toolbox/audio/samples'
                              }
                      Labels: {'A'; 'B'}
    AlternateFileSystemRoots: {}
              OutputDataType: 'double'
           OutputEnvironment: 'cpu'
      SupportedOutputFormats: ["wav"    "flac"    "ogg"    "opus"    "mp3"    "mp4"    "m4a"]
         DefaultOutputFormat: "wav"

ADS30 = 
  audioDatastore with properties:

                       Files: {
                              ' .../build/runnable/matlab/toolbox/audio/samples/Heli_16ch_ACN_SN3D.wav';
                              ' .../matlab/toolbox/audio/samples/JetAirplane-16-11p025-mono-16secs.wav';
                              ' .../runnable/matlab/toolbox/audio/samples/Laughter-16-8-mono-4secs.wav'
                               ... and 3 more
                              }
                     Folders: {
                              ' .../Bdoc24b.2679053/build/runnable/matlab/toolbox/audio/samples'
                              }
                      Labels: {'A'; 'A'; 'A' ... and 3 more}
    AlternateFileSystemRoots: {}
              OutputDataType: 'double'
           OutputEnvironment: 'cpu'
      SupportedOutputFormats: ["wav"    "flac"    "ogg"    "opus"    "mp3"    "mp4"    "m4a"]
         DefaultOutputFormat: "wav"

Call countEachLabel to confirm the correct distribution of labels for each datastore.

countEachLabel(ADS60)
ans=2×2 table
    Label    Count
    _____    _____

      A        6  
      B        6  

countEachLabel(ADS10)
ans=2×2 table
    Label    Count
    _____    _____

      A        1  
      B        1  

countEachLabel(ADS30)
ans=2×2 table
    Label    Count
    _____    _____

      A        3  
      B        3  
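
As a quick sanity check on the three-way split above, the following sketch (not part of the original example) verifies that the three output datastores partition the files of ADS: no file appears twice and no file is dropped. It rebuilds ADS the same way as the example so that it can run on its own. The variable names nFiles, splitFiles, noDuplicates, and nothingDropped are illustrative.

```matlab
% Rebuild the labeled datastore used in the example above.
folder = fullfile(matlabroot,'toolbox','audio','samples');
ADS = audioDatastore(folder,'FileExtensions','.wav');
nFiles = numel(ADS.Files);
ADS.Labels = [repmat({'A'},1,floor(nFiles/2)), ...
              repmat({'B'},1,ceil(nFiles/2))];

[ADS60,ADS10,ADS30] = splitEachLabel(ADS,0.6,0.1);

% Collect the file lists of the three outputs into one cell array.
splitFiles = [ADS60.Files; ADS10.Files; ADS30.Files];

% Every file should appear exactly once across the three outputs.
noDuplicates = numel(unique(splitFiles)) == numel(splitFiles);
nothingDropped = isempty(setdiff(ADS.Files,splitFiles));
fprintf('No duplicates: %d, nothing dropped: %d\n',noDuplicates,nothingDropped)
```

The same check applies to any of the other syntaxes, as long as no labels are left out with 'Include' or 'Exclude'.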

Specify the file path to the audio samples included with Audio Toolbox™. Create an audio datastore that points to the specified folder.

folder = fullfile(matlabroot,'toolbox','audio','samples');
ADS = audioDatastore(folder,'FileExtensions','.wav');

Add the label A to the first half of the files, and the label B to the second half. If there is an odd number of files, assign the extra file the label B. Call countEachLabel to confirm that half of the files are labeled A and half the files are labeled B.

labels = [repmat({'A'},1,floor(numel(ADS.Files)/2)), ...
          repmat({'B'},1,ceil(numel(ADS.Files)/2))];
ADS.Labels = labels;

countEachLabel(ADS)
ans=2×2 table
    Label    Count
    _____    _____

      A       10  
      B       10  

Split ADS into three new datastores, ADS1, ADS2, and ADS3. The first datastore, ADS1, contains the first file with the A label and the first file with the B label. ADS2 contains the next file from each label. ADS3 contains the remaining files from each label.

[ADS1,ADS2,ADS3] = splitEachLabel(ADS,1,1)
ADS1 = 
  audioDatastore with properties:

                       Files: {
                              ' .../matlab/toolbox/audio/samples/Ambiance-16-44p1-mono-12secs.wav';
                              ' .../matlab/toolbox/audio/samples/MainStreetOne-16-16-mono-12secs.wav'
                              }
                     Folders: {
                              ' .../Bdoc24b.2679053/build/runnable/matlab/toolbox/audio/samples'
                              }
                      Labels: {'A'; 'B'}
    AlternateFileSystemRoots: {}
              OutputDataType: 'double'
           OutputEnvironment: 'cpu'
      SupportedOutputFormats: ["wav"    "flac"    "ogg"    "opus"    "mp3"    "mp4"    "m4a"]
         DefaultOutputFormat: "wav"

ADS2 = 
  audioDatastore with properties:

                       Files: {
                              ' .../matlab/toolbox/audio/samples/AudioArray-16-16-4channels-20secs.wav';
                              ' .../matlab/toolbox/audio/samples/NoisySpeech-16-22p5-mono-5secs.wav'
                              }
                     Folders: {
                              ' .../Bdoc24b.2679053/build/runnable/matlab/toolbox/audio/samples'
                              }
                      Labels: {'A'; 'B'}
    AlternateFileSystemRoots: {}
              OutputDataType: 'double'
           OutputEnvironment: 'cpu'
      SupportedOutputFormats: ["wav"    "flac"    "ogg"    "opus"    "mp3"    "mp4"    "m4a"]
         DefaultOutputFormat: "wav"

ADS3 = 
  audioDatastore with properties:

                       Files: {
                              ' .../toolbox/audio/samples/ChurchImpulseResponse-16-44p1-mono-5secs.wav';
                              ' .../runnable/matlab/toolbox/audio/samples/Click-16-44p1-mono-0.2secs.wav';
                              ' .../matlab/toolbox/audio/samples/Counting-16-44p1-mono-15secs.wav'
                               ... and 13 more
                              }
                     Folders: {
                              ' .../Bdoc24b.2679053/build/runnable/matlab/toolbox/audio/samples'
                              }
                      Labels: {'A'; 'A'; 'A' ... and 13 more}
    AlternateFileSystemRoots: {}
              OutputDataType: 'double'
           OutputEnvironment: 'cpu'
      SupportedOutputFormats: ["wav"    "flac"    "ogg"    "opus"    "mp3"    "mp4"    "m4a"]
         DefaultOutputFormat: "wav"

Call countEachLabel to confirm the correct distribution of labels for each datastore.

countEachLabel(ADS1)
ans=2×2 table
    Label    Count
    _____    _____

      A        1  
      B        1  

countEachLabel(ADS2)
ans=2×2 table
    Label    Count
    _____    _____

      A        1  
      B        1  

countEachLabel(ADS3)
ans=2×2 table
    Label    Count
    _____    _____

      A        8  
      B        8  

Specify the file path to the audio samples included with Audio Toolbox™. Create an audio datastore that points to the specified folder.

folder = fullfile(matlabroot,'toolbox','audio','samples');
ADS = audioDatastore(folder,'FileExtensions','.wav')
ADS = 
  audioDatastore with properties:

                       Files: {
                              ' .../matlab/toolbox/audio/samples/Ambiance-16-44p1-mono-12secs.wav';
                              ' .../matlab/toolbox/audio/samples/AudioArray-16-16-4channels-20secs.wav';
                              ' .../toolbox/audio/samples/ChurchImpulseResponse-16-44p1-mono-5secs.wav'
                               ... and 17 more
                              }
                     Folders: {
                              ' .../Bdoc24b.2679053/build/runnable/matlab/toolbox/audio/samples'
                              }
    AlternateFileSystemRoots: {}
              OutputDataType: 'double'
           OutputEnvironment: 'cpu'
                      Labels: {}
      SupportedOutputFormats: ["wav"    "flac"    "ogg"    "opus"    "mp3"    "mp4"    "m4a"]
         DefaultOutputFormat: "wav"

Add the label A to the first half of the files, and the label B to the second half. If there is an odd number of files, assign the extra file the label B. Call countEachLabel to confirm that half of the files are labeled A and half the files are labeled B.

labels = [repmat({'A'},1,floor(numel(ADS.Files)/2)), ...
          repmat({'B'},1,ceil(numel(ADS.Files)/2))];
ADS.Labels = labels;

countEachLabel(ADS)
ans=2×2 table
    Label    Count
    _____    _____

      A       10  
      B       10  

Create two new datastores from the files in ADS by randomly drawing from each label. The first datastore, ADS1, contains two random files with the A label and two random files with the B label. ADS2 contains the remaining files from each label.

[ADS1,ADS2] = splitEachLabel(ADS,2,'randomized')
ADS1 = 
  audioDatastore with properties:

                       Files: {
                              ' .../toolbox/audio/samples/ChurchImpulseResponse-16-44p1-mono-5secs.wav';
                              ' .../runnable/matlab/toolbox/audio/samples/Engine-16-44p1-stereo-20sec.wav';
                              ' .../matlab/toolbox/audio/samples/MainStreetOne-16-16-mono-12secs.wav'
                               ... and 1 more
                              }
                     Folders: {
                              ' .../Bdoc24b.2679053/build/runnable/matlab/toolbox/audio/samples'
                              }
                      Labels: {'A'; 'A'; 'B' ... and 1 more}
    AlternateFileSystemRoots: {}
              OutputDataType: 'double'
           OutputEnvironment: 'cpu'
      SupportedOutputFormats: ["wav"    "flac"    "ogg"    "opus"    "mp3"    "mp4"    "m4a"]
         DefaultOutputFormat: "wav"

ADS2 = 
  audioDatastore with properties:

                       Files: {
                              ' .../matlab/toolbox/audio/samples/Ambiance-16-44p1-mono-12secs.wav';
                              ' .../matlab/toolbox/audio/samples/AudioArray-16-16-4channels-20secs.wav';
                              ' .../runnable/matlab/toolbox/audio/samples/Click-16-44p1-mono-0.2secs.wav'
                               ... and 13 more
                              }
                     Folders: {
                              ' .../Bdoc24b.2679053/build/runnable/matlab/toolbox/audio/samples'
                              }
                      Labels: {'A'; 'A'; 'A' ... and 13 more}
    AlternateFileSystemRoots: {}
              OutputDataType: 'double'
           OutputEnvironment: 'cpu'
      SupportedOutputFormats: ["wav"    "flac"    "ogg"    "opus"    "mp3"    "mp4"    "m4a"]
         DefaultOutputFormat: "wav"
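
Because the 'randomized' option draws files at random, two runs generally produce different splits. A common way to make a random split repeatable is to seed the random number generator before the call. The sketch below assumes that splitEachLabel draws from MATLAB's global random stream, which this page does not state explicitly, so treat the repeatability check as something to confirm in your release. The names ADS1a, ADS1b, and sameSplit are illustrative.

```matlab
% Rebuild the labeled datastore used in the example above.
folder = fullfile(matlabroot,'toolbox','audio','samples');
ADS = audioDatastore(folder,'FileExtensions','.wav');
nFiles = numel(ADS.Files);
ADS.Labels = [repmat({'A'},1,floor(nFiles/2)), ...
              repmat({'B'},1,ceil(nFiles/2))];

rng(0)   % seed the global generator (assumed to drive the random split)
[ADS1a,~] = splitEachLabel(ADS,2,'randomized');

rng(0)   % reseed with the same value before the second call
[ADS1b,~] = splitEachLabel(ADS,2,'randomized');

% If the assumption holds, both calls select the same files.
sameSplit = isequal(ADS1a.Files,ADS1b.Files)
```

If sameSplit is false in your environment, the split is not controlled by the global stream, and you would need to save the resulting file lists to reproduce a split later.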

Specify the file path to the audio samples included with Audio Toolbox™. Create an audio datastore that points to the specified folder.

folder = fullfile(matlabroot,'toolbox','audio','samples');
ADS = audioDatastore(folder,'FileExtensions','.wav')
ADS = 

  audioDatastore with properties:

                       Files: {
                              ' .../matlab/toolbox/audio/samples/Ambiance-16-44p1-mono-12secs.wav';
                              ' .../matlab/toolbox/audio/samples/AudioArray-16-16-4channels-20secs.wav';
                              ' .../toolbox/audio/samples/ChurchImpulseResponse-16-44p1-mono-5secs.wav'
                               ... and 17 more
                              }
                     Folders: {
                              ' .../Bdoc24b.2679053/build/runnable/matlab/toolbox/audio/samples'
                              }
    AlternateFileSystemRoots: {}
              OutputDataType: 'double'
           OutputEnvironment: 'cpu'
                      Labels: {}
      SupportedOutputFormats: ["wav"    "flac"    ...    ] (1x7 string)
         DefaultOutputFormat: "wav"

Add the label A to the first half of the files, and the label B to the second half. If there is an odd number of files, assign the extra file the label B. Call countEachLabel to confirm that half of the files are labeled A and half the files are labeled B.

labels = [repmat({'A'},1,floor(numel(ADS.Files)/2)), ...
          repmat({'B'},1,ceil(numel(ADS.Files)/2))];
ADS.Labels = labels;

countEachLabel(ADS)
ans =

  2x2 table

    Label    Count
    _____    _____

      A       10  
      B       10  

Create two new datastores from the files in ADS, including only the files with the A label. ADS1 contains the first 70% of files with the A label, and ADS2 contains the remaining 30% of files with the A label.

[ADS1,ADS2] = splitEachLabel(ADS,0.7,'Include','A')
ADS1 = 

  audioDatastore with properties:

                       Files: {
                              ' .../matlab/toolbox/audio/samples/Ambiance-16-44p1-mono-12secs.wav';
                              ' .../matlab/toolbox/audio/samples/AudioArray-16-16-4channels-20secs.wav';
                              ' .../toolbox/audio/samples/ChurchImpulseResponse-16-44p1-mono-5secs.wav'
                               ... and 4 more
                              }
                     Folders: {
                              ' .../Bdoc24b.2679053/build/runnable/matlab/toolbox/audio/samples'
                              }
                      Labels: {'A'; 'A'; 'A' ... and 4 more}
    AlternateFileSystemRoots: {}
              OutputDataType: 'double'
           OutputEnvironment: 'cpu'
      SupportedOutputFormats: ["wav"    "flac"    ...    ] (1x7 string)
         DefaultOutputFormat: "wav"


ADS2 = 

  audioDatastore with properties:

                       Files: {
                              ' .../build/runnable/matlab/toolbox/audio/samples/Heli_16ch_ACN_SN3D.wav';
                              ' .../matlab/toolbox/audio/samples/JetAirplane-16-11p025-mono-16secs.wav';
                              ' .../runnable/matlab/toolbox/audio/samples/Laughter-16-8-mono-4secs.wav'
                              }
                     Folders: {
                              ' .../Bdoc24b.2679053/build/runnable/matlab/toolbox/audio/samples'
                              }
                      Labels: {'A'; 'A'; 'A'}
    AlternateFileSystemRoots: {}
              OutputDataType: 'double'
           OutputEnvironment: 'cpu'
      SupportedOutputFormats: ["wav"    "flac"    ...    ] (1x7 string)
         DefaultOutputFormat: "wav"

Equivalently, you can split only the A label by excluding the B label.

[ADS1,ADS2] = splitEachLabel(ADS,0.7,'Exclude','B')
ADS1 = 

  audioDatastore with properties:

                       Files: {
                              ' .../matlab/toolbox/audio/samples/Ambiance-16-44p1-mono-12secs.wav';
                              ' .../matlab/toolbox/audio/samples/AudioArray-16-16-4channels-20secs.wav';
                              ' .../toolbox/audio/samples/ChurchImpulseResponse-16-44p1-mono-5secs.wav'
                               ... and 4 more
                              }
                     Folders: {
                              ' .../Bdoc24b.2679053/build/runnable/matlab/toolbox/audio/samples'
                              }
                      Labels: {'A'; 'A'; 'A' ... and 4 more}
    AlternateFileSystemRoots: {}
              OutputDataType: 'double'
           OutputEnvironment: 'cpu'
      SupportedOutputFormats: ["wav"    "flac"    ...    ] (1x7 string)
         DefaultOutputFormat: "wav"


ADS2 = 

  audioDatastore with properties:

                       Files: {
                              ' .../build/runnable/matlab/toolbox/audio/samples/Heli_16ch_ACN_SN3D.wav';
                              ' .../matlab/toolbox/audio/samples/JetAirplane-16-11p025-mono-16secs.wav';
                              ' .../runnable/matlab/toolbox/audio/samples/Laughter-16-8-mono-4secs.wav'
                              }
                     Folders: {
                              ' .../Bdoc24b.2679053/build/runnable/matlab/toolbox/audio/samples'
                              }
                      Labels: {'A'; 'A'; 'A'}
    AlternateFileSystemRoots: {}
              OutputDataType: 'double'
           OutputEnvironment: 'cpu'
      SupportedOutputFormats: ["wav"    "flac"    ...    ] (1x7 string)
         DefaultOutputFormat: "wav"

Specify the file path to the audio samples included with Audio Toolbox™. Create an audio datastore that points to the specified folder.

folder = fullfile(matlabroot,'toolbox','audio','samples');
ADS = audioDatastore(folder)
ADS = 
  audioDatastore with properties:

                       Files: {
                              ' .../matlab/toolbox/audio/samples/Ambiance-16-44p1-mono-12secs.wav';
                              ' .../matlab/toolbox/audio/samples/AudioArray-16-16-4channels-20secs.wav';
                              ' .../toolbox/audio/samples/ChurchImpulseResponse-16-44p1-mono-5secs.wav'
                               ... and 36 more
                              }
                     Folders: {
                              ' .../Bdoc24b.2679053/build/runnable/matlab/toolbox/audio/samples'
                              }
    AlternateFileSystemRoots: {}
              OutputDataType: 'double'
           OutputEnvironment: 'cpu'
                      Labels: {}
      SupportedOutputFormats: ["wav"    "flac"    "ogg"    "opus"    "mp3"    "mp4"    "m4a"]
         DefaultOutputFormat: "wav"

Create a label table with two variables:

  • containsMusic - Can be either true or false.

  • instrument - Can be Guitar, Drums, or Unknown.

containsGuitar = contains(ADS.Files,'guitar','IgnoreCase',true);
containsDrums = contains(ADS.Files,'drum','IgnoreCase',true);
containsMusic = or(containsGuitar,containsDrums);

instrument = strings(size(ADS.Files));
instrument(:) = "Unknown";
instrument(containsGuitar) = "Guitar";
instrument(containsDrums) = "Drums";

Assign the label table to the Labels property of the audio datastore to associate the rows of the label table with the rows of the datastore. Call countEachLabel to determine the incidences of containsMusic and instrument.

labels = table(containsMusic,instrument);
ADS.Labels = labels;

containsMusicCount = countEachLabel(ADS,'TableVariable','containsMusic')
containsMusicCount=2×2 table
    containsMusic    Count
    _____________    _____

        false         32  
        true           7  

instrumentCount = countEachLabel(ADS,'TableVariable','instrument')
instrumentCount=3×2 table
    instrument    Count
    __________    _____

     Drums          4  
     Guitar         3  
     Unknown       32  

Split the datastore ADS into two, based on whether the audio file contains music. ADS1 contains 70% of the files from each containsMusic label, and ADS2 contains the rest. Call countEachLabel to verify that the ratio of containsMusic == true to containsMusic == false is preserved for the new datastores, within rounding.

[ADS1,ADS2] = splitEachLabel(ADS,0.7,'TableVariable','containsMusic');
ADS1_containsMusicCount = countEachLabel(ADS1,'TableVariable','containsMusic')
ADS1_containsMusicCount=2×2 table
    containsMusic    Count
    _____________    _____

        false         22  
        true           5  

ADS2_containsMusicCount = countEachLabel(ADS2,'TableVariable','containsMusic')
ADS2_containsMusicCount=2×2 table
    containsMusic    Count
    _____________    _____

        false         10  
        true           2  

Split the datastore ADS into two, based on the type of instrument present in the audio file. ADS3 contains 25% of the files from each instrument label, and ADS4 contains the rest. Call countEachLabel to verify that the ratio of instrument == "Drums" to instrument == "Guitar" is preserved for the new datastores, within rounding.

[ADS3,ADS4] = splitEachLabel(ADS,0.25,'TableVariable','instrument');
ADS3_instrumentCount = countEachLabel(ADS3,'TableVariable','instrument')
ADS3_instrumentCount=3×2 table
    instrument    Count
    __________    _____

     Drums          1  
     Guitar         1  
     Unknown        8  

ADS4_instrumentCount = countEachLabel(ADS4,'TableVariable','instrument')
ADS4_instrumentCount=3×2 table
    instrument    Count
    __________    _____

     Drums          3  
     Guitar         2  
     Unknown       24  

Specify the file path to the audio samples included with Audio Toolbox™. Create an audio datastore that points to the specified folder.

folder = fullfile(matlabroot,'toolbox','audio','samples');
ADS = audioDatastore(folder);

Create a label table with two variables:

  • containsMusic - Can be either true or false.

  • instrument - Can be Guitar, Drums, or Unknown.

containsGuitar = contains(ADS.Files,'guitar','IgnoreCase',true);
containsDrums = contains(ADS.Files,'drum','IgnoreCase',true);
containsMusic = or(containsGuitar,containsDrums);

instrument = strings(size(ADS.Files));
instrument(:) = "Unknown";
instrument(containsGuitar) = "Guitar";
instrument(containsDrums) = "Drums";

Assign the label table to the Labels property of the audio datastore to associate the rows of the label table with the rows of the datastore. Call countEachLabel to determine the incidences of containsMusic and instrument.

labels = table(containsMusic,instrument);
ADS.Labels = labels;

containsMusicCount = countEachLabel(ADS,'TableVariable','containsMusic')
containsMusicCount=2×2 table
    containsMusic    Count
    _____________    _____

        false         32  
        true           7  

instrumentCount = countEachLabel(ADS,'TableVariable','instrument');

Split the datastore ADS into two, based on whether the audio file contains music. ADS1 contains 5 of each label under the table variable containsMusic, and ADS2 contains the rest. Call countEachLabel to verify.

[ADS1,ADS2] = splitEachLabel(ADS,5,'TableVariable','containsMusic');
ADS1_containsMusicCount = countEachLabel(ADS1,'TableVariable','containsMusic')
ADS1_containsMusicCount=2×2 table
    containsMusic    Count
    _____________    _____

        false          5  
        true           5  

ADS2_containsMusicCount = countEachLabel(ADS2,'TableVariable','containsMusic')
ADS2_containsMusicCount=2×2 table
    containsMusic    Count
    _____________    _____

        false         27  
        true           2  

Split the datastore ADS into two, based on the type of instrument present in the audio file. ADS3 contains 2 of each label under the table variable instrument, and ADS4 contains the rest. Call countEachLabel to verify.

[ADS3,ADS4] = splitEachLabel(ADS,2,'TableVariable','instrument');
ADS3_instrumentCount = countEachLabel(ADS3,'TableVariable','instrument')
ADS3_instrumentCount=3×2 table
    instrument    Count
    __________    _____

     Drums          2  
     Guitar         2  
     Unknown        2  

ADS4_instrumentCount = countEachLabel(ADS4,'TableVariable','instrument')
ADS4_instrumentCount=3×2 table
    instrument    Count
    __________    _____

     Drums          2  
     Guitar         1  
     Unknown       30  

Input Arguments

Input audio datastore, specified as an audioDatastore object.

Proportion of files to split, specified as a scalar in the interval (0,1), or a positive integer scalar.

If p is in the interval (0,1), it represents the percentage of the files from each label to assign to ADS1. If p represents a percentage, and it does not result in a whole number, then splitEachLabel rounds down to the nearest whole number.

If p is an integer, it represents the absolute number of files from each label to assign to ADS1. When p represents a number of files, there must be at least p files associated with each label.

Data Types: double

List of proportions, specified as scalars in the interval (0,1) or positive integer scalars.

If the proportions are in the interval (0,1), they represent the percentage of the files from each label to assign to the output datastores. When the proportions represent percentages, their sum must be no more than 1.

If the proportions are integers, they indicate the absolute number of files from each label to assign to the output datastores. When the proportions represent numbers of files, there must be enough files associated with each label to satisfy each proportion.

Data Types: double
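
The constraint on absolute counts can be checked before splitting. This sketch uses countEachLabel to find the smallest label and compares it with the total number of files requested per label. The names requested, counts, and smallestLabel, and the requested values 4 and 3, are illustrative. The labeling scheme is borrowed from the examples on this page.

```matlab
% Build a labeled datastore from the toolbox sample files.
folder = fullfile(matlabroot,'toolbox','audio','samples');
ADS = audioDatastore(folder,'FileExtensions','.wav');
nFiles = numel(ADS.Files);
ADS.Labels = [repmat({'A'},1,floor(nFiles/2)), ...
              repmat({'B'},1,ceil(nFiles/2))];

requested = [4 3];              % files per label for the first two outputs
counts = countEachLabel(ADS);   % table with Label and Count variables
smallestLabel = min(counts.Count);

% Split only if every label has enough files to satisfy the request.
if sum(requested) <= smallestLabel
    [ADS1,ADS2,ADS3] = splitEachLabel(ADS,requested(1),requested(2));
else
    warning('Requested %d files per label, but the smallest label has only %d.', ...
            sum(requested),smallestLabel)
end
```

For proportions in the interval (0,1), the analogous guard is a check that sum(requested) is no more than 1.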

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: [ADS1,ADS2] = splitEachLabel(ADS,0.5,'Exclude','noisy')

Labels to include, specified as the comma-separated pair consisting of 'Include' and a vector, cell array, or string array of label names with the same type as the Labels property. Each name must match one of the labels in the Labels property of the datastore.

This option cannot be used with the 'Exclude' option.

Labels to exclude, specified as the comma-separated pair consisting of 'Exclude' and a vector, cell array, or string array of label names with the same type as the Labels property. Each name must match one of the labels in the Labels property of the datastore.

This option cannot be used with the 'Include' option.
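
The examples on this page pass a single label name to 'Include' and 'Exclude'. The sketch below passes several names at once as a cell array of character vectors, which matches the type of the Labels property used here. The three-label scheme (A, B, C assigned in rotation) and the names labelNames and idx are illustrative only.

```matlab
% Build a datastore from the toolbox sample files.
folder = fullfile(matlabroot,'toolbox','audio','samples');
ADS = audioDatastore(folder,'FileExtensions','.wav');

% Assign three illustrative labels, cycling A, B, C over the files.
labelNames = {'A','B','C'};
idx = mod(0:numel(ADS.Files)-1,numel(labelNames)) + 1;
ADS.Labels = labelNames(idx);

% Split only labels A and C. As in the 'Include' example on this page,
% files whose label is left out (here B) appear in neither output.
[ADS1,ADS2] = splitEachLabel(ADS,0.5,'Include',{'A','C'});
countEachLabel(ADS1)
countEachLabel(ADS2)
```

Under the same assumptions, replacing 'Include',{'A','C'} with 'Exclude','B' should select the same labels.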

Table variable name, specified as the comma-separated pair consisting of 'TableVariable' and a character vector or string. When the Labels property of the audio datastore ADS is a table, you must use 'TableVariable' to specify which label you are using to split.

Data Types: char | string

Output Arguments

Output audio datastores, returned as audioDatastore objects. ADS1 contains the specified proportion of files from each label in ADS, and ADS2 contains the remaining files.

List of output audio datastores, returned as audioDatastore objects. The number of elements in the list is one more than the number of listed proportions. Each of the new datastores contains the proportion of each label in ADS defined by p1,…,pN. Any files left over are assigned to the Mth datastore.

Version History

Introduced in R2018b