Human Activity Recognition Using Signal Feature Extraction and Machine Learning

This example shows how to extract features from smartphone accelerometer signals to classify human activity using a machine learning algorithm. The feature extraction for the data is done using the signalTimeFeatureExtractor and signalFrequencyFeatureExtractor objects. The features are used to train a support vector machine (SVM) model.

Data Set

The Sensor HAR (human activity recognition) App [1] was used to collect raw accelerometer signals in [2]. A subject wore the smartphone during five different types of physical activity. The data was then buffered into 44-sample-long signals, each corresponding to a particular activity. The Dancing activity and the accelerometer signals in the y and z directions were excluded to create the BufferedHumanActivity data set, stored in the BufferedHumanActivity.mat file used in this example.

Load the BufferedHumanActivity data set.

load BufferedHumanActivity.mat

The data set contains 7776 x-direction accelerometer signals. Each signal has a duration of 44 samples and corresponds to one of four different physical human activities: Sitting, Standing, Walking and Running. The data set contains the following variables:

  • atx — Buffered x-direction accelerometer sensor data of fixed length (44 by 7776 matrix)

  • actid — Response vector containing the activity IDs in integers: 1, 2, 3, and 4 representing Sitting, Standing, Walking and Running, respectively

  • actnames — List of activity names for each activity ID

  • fs — Sample rate of accelerometer sensor data

Feature Extraction

The accelerometer signals can be thought of as containing two main components: "fast" variations over time caused by body dynamics (physical movements of the subject), and "slow" variations over time caused by the position of the body with respect to the vertical gravitational field.
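
As a rough illustration of this decomposition, the following sketch builds a synthetic 44-sample signal from a constant level plus an oscillation. The sample rate, offset, and tone frequency are hypothetical values chosen for the illustration, not properties of the data set, and subtracting the mean is only a crude stand-in for highpass filtering.

```matlab
% Synthetic illustration only; fsDemo, offset, and the 1.25 Hz tone are
% hypothetical values, not taken from the BufferedHumanActivity data set.
fsDemo = 10;                          % assumed sample rate in Hz
n = (0:43)';                          % 44 samples, matching the buffered signal length
offset = -73;                         % "slow" component: constant gravity-related level
fast = 0.2*sin(2*pi*1.25*n/fsDemo);   % "fast" component: body-dynamics oscillation
x = offset + fast;

slowEstimate = mean(x);               % the mean approximates the slow component
fastEstimate = x - slowEstimate;      % crude stand-in for highpass filtering

fprintf("Estimated slow level: %.2f\n", slowEstimate)
fprintf("Peak of fast estimate: %.2f\n", max(abs(fastEstimate)))
```

The mean recovers the slow level closely but not exactly, because the oscillation does not complete a whole number of cycles in 44 samples.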

To isolate the rapid signal variations from the slower ones, we apply a highpass filter to the original signals. We then extract different features from the filtered and unfiltered signals using the signalTimeFeatureExtractor and signalFrequencyFeatureExtractor objects. These objects compute multiple time-domain and frequency-domain features efficiently with one function call.

% Filter the signals with a highpass filter
atxFiltered = highpass(atx,0.7,fs);

For time features, two signalTimeFeatureExtractor objects are configured. One is used to extract the mean of the unfiltered signals (meanFE) and the second is used to extract the root mean square, shape factor, peak value, crest factor, clearance factor, and impulse factor of the filtered signals (timeFE).
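
Before configuring the extractors, it can help to see what the six filtered-signal features measure. The sketch below computes them by hand for a short hypothetical signal using commonly used definitions. Treat these definitions as assumptions: the extractor objects may differ in implementation details.

```matlab
% Hand computation of the six time-domain features for one signal.
% The signal values are hypothetical, and the definitions are the
% commonly used ones, not a copy of the extractor's implementation.
x = [1; -2; 3; -4; 2];

rmsValue        = sqrt(mean(x.^2));               % root mean square
peakValue       = max(abs(x));                    % largest absolute amplitude
meanAbs         = mean(abs(x));                   % mean of the absolute value
shapeFactor     = rmsValue/meanAbs;               % RMS relative to mean absolute value
crestFactor     = peakValue/rmsValue;             % peak relative to RMS
impulseFactor   = peakValue/meanAbs;              % peak relative to mean absolute value
clearanceFactor = peakValue/mean(sqrt(abs(x)))^2; % peak relative to squared mean of square roots

fprintf("RMS %.3f, peak %.3f, shape %.3f\n", rmsValue, peakValue, shapeFactor)
fprintf("Crest %.3f, impulse %.3f, clearance %.3f\n", crestFactor, impulseFactor, clearanceFactor)
```

Under these definitions, the impulse factor equals the crest factor multiplied by the shape factor, so the three ratios carry related information about how peaky a signal is. The extractor objects configured next compute features of this kind for every signal.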

meanFE = signalTimeFeatureExtractor("Mean",true,"SampleRate",fs);
timeFE = signalTimeFeatureExtractor("RMS",true,...
    "ShapeFactor",true,...
    "PeakValue",true,...
    "CrestFactor",true,...
    "ClearanceFactor",true,...    
    "ImpulseFactor",true,...
    "SampleRate",fs);

For frequency features, signalFrequencyFeatureExtractor is used to extract the mean frequency, band power, half-power bandwidth, peak amplitude and peak location of the filtered signals.

freqFE = signalFrequencyFeatureExtractor("PeakAmplitude",true,...
    "PeakLocation",true,...
    "MeanFrequency",true,...
    "BandPower",true,...
    "PowerBandwidth",true,...
    "SampleRate",fs);

The computation of spectral peaks can be refined by setting more parameters. For instance, the maximum number of peaks is set to 6, and the minimum distance between spectral peaks is set to 0.25 Hz, which the code converts to a whole number of frequency bins. Additionally, we choose an FFT length of 256 and a rectangular window of 44 samples (that is, the signal length) to compute the spectral estimates.

fftLength = 256;
window = rectwin(size(atx,1));
setExtractorParameters(freqFE,"WelchPSD","FFTLength",fftLength,"Window",window);
mindist_xunits = 0.25;
minpkdist = floor(mindist_xunits/(fs/fftLength));
setExtractorParameters(freqFE,"PeakAmplitude","MaxNumExtrema",6,"MinSeparation",minpkdist);
setExtractorParameters(freqFE,"PeakLocation","MaxNumExtrema",6,"MinSeparation",minpkdist);
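
The conversion from hertz to bins, and the effect of zero-padding a 44-sample signal to 256 points, can be sketched with core MATLAB functions. The sample rate fsDemo and the 1.25 Hz test tone are hypothetical; the value of fs in the data set is not stated here, and the extractor's own spectral estimator may scale its output differently.

```matlab
% Sketch only: fsDemo and the 1.25 Hz test tone are hypothetical values.
fsDemo = 10;                                 % assumed sample rate in Hz
fftLength = 256;                             % FFT length used in the example
binWidth = fsDemo/fftLength;                 % spacing between frequency bins in Hz

mindist_xunits = 0.25;                       % desired minimum peak separation in Hz
minpkdist = floor(mindist_xunits/binWidth);  % same separation expressed in bins

% Zero-padded periodogram of a 44-sample tone with a rectangular window
n = (0:43)';
x = sin(2*pi*1.25*n/fsDemo);
X = fft(x, fftLength);
P = abs(X(1:fftLength/2+1)).^2;              % one-sided power, up to a constant scale
P(2:end-1) = 2*P(2:end-1);                   % fold in negative-frequency power
f = (0:fftLength/2)'*binWidth;               % frequency axis in Hz

[~, idx] = max(P);
peakLocation = f(idx);                       % frequency of the strongest peak
meanFrequency = sum(f.*P)/sum(P);            % power-weighted average frequency

fprintf("Bin width %.4f Hz, minimum separation %d bins\n", binWidth, minpkdist)
fprintf("Peak at %.3f Hz, mean frequency %.3f Hz\n", peakLocation, meanFrequency)
```

With the assumed 10 Hz sample rate, 0.25 Hz corresponds to 6.4 bins, which floor rounds down to 6. Zero-padding does not add resolution, but it samples the spectrum on a finer grid, so peak locations can be read off more precisely.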

The computation of features for all the signals can be parallelized using transformed array datastores. The datastores read each matrix column and compute features using the extract function of the feature extractor objects.

meanFeatureDs = arrayDatastore(atx,"IterationDimension",2);
meanFeatureDs = transform(meanFeatureDs,@(x)meanFE.extract(x{:}));
timeFeatureDs = arrayDatastore(atxFiltered,"IterationDimension",2);
timeFeatureDs = transform(timeFeatureDs,@(x)timeFE.extract(x{:}));
freqFeatureDs = arrayDatastore(atxFiltered,"IterationDimension",2);
freqFeatureDs = transform(freqFeatureDs,@(x)freqFE.extract(x{:}));
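
The mechanics of a transformed array datastore can be tried on a small synthetic matrix. This sketch substitutes a plain mean for the feature extractor and reads the data serially, so it needs neither the data set nor Parallel Computing Toolbox. The matrix X is a hypothetical stand-in for atx.

```matlab
% Synthetic stand-in for atx: 44 samples by 5 signals (hypothetical data).
X = reshape(1:220, 44, 5);

% With "IterationDimension" set to 2, each read returns a cell holding
% one column of X; x{:} unpacks that column for the transform function.
ds = arrayDatastore(X, "IterationDimension", 2);
ds = transform(ds, @(x) mean(x{:}));

% readall applies the transform to every column and stacks the results,
% giving one row per signal.
columnMeans = readall(ds);

disp(columnMeans')
```

Because each signal is processed independently, the same pattern distributes well across workers when the "UseParallel" option is available.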

Call the readall function of each transformed datastore with the "UseParallel" option set to true to distribute the computations across a pool of workers if Parallel Computing Toolbox is installed. Combining the results gives 22 features for each of the 7776 signal observations: 1 mean, 6 time features, and 15 frequency features (6 peak amplitudes, 6 peak locations, mean frequency, band power, and power bandwidth).

meanFeatures = readall(meanFeatureDs,"UseParallel",true);
Starting parallel pool (parpool) using the 'Processes' profile ...
Connected to parallel pool with 8 workers.
timeFeatures = readall(timeFeatureDs,"UseParallel",true);
freqFeatures = readall(freqFeatureDs,"UseParallel",true);

features = [meanFeatures timeFeatures freqFeatures];

Train an SVM Classifier Using Extracted Features

You can import the features and activity labels into the Classification Learner app to train an SVM classifier. Alternatively, you can create an SVM template and classifier using a feature table containing the features (predictors) and the activity labels (response) as follows.

First create a table with predictors and response.

featureTable = array2table(features);
actioncats = categorical(actnames)';
featureTable.ActivityID = actioncats(actid);
head(featureTable)
    features1    features2    features3    features4    features5    features6    features7    features8    features9    features10    features11    features12    features13    features14    features15    features16    features17    features18    features19    features20    features21    features22    ActivityID
    _________    _________    _________    _________    _________    _________    _________    _________    _________    __________    __________    __________    __________    __________    __________    __________    __________    __________    __________    __________    __________    __________    __________

     -73.408      0.10678      1.2695       0.24501      2.2946       3.5282        2.913       3.1208       0.011402     0.22658      0.0037348     0.0043388      0.0049913      0.014314    0.0032949     0.0042457      0.74219        1.6797        3.2031        3.8281        4.2188        4.5703       Sitting  
      -73.43      0.06735      1.2521       0.13304      1.9753       2.9553       2.4733       1.8959       0.004536     0.18083      0.0078615      0.001071      0.0046553     0.0023938    0.0017709      0.002039      0.74219          1.25        1.5625        2.3438        3.5938        3.9453       Sitting  
      -73.41       0.0626       1.303       0.15569       2.487       3.9603       3.2407       2.4191      0.0039188     0.18783      0.0036916      0.001265     0.00086816    0.00098286    0.0029621     0.0044119      0.74219        1.4062        2.2266        2.7734        3.0859        4.6094       Sitting  
     -73.393     0.072056      1.3457       0.20023      2.7788       4.6384       3.7394       2.9361       0.005192     0.21444      0.0028194     0.0016623      0.0028484     0.0018433     0.003666     0.0026144      0.89844        1.6797        2.3047        3.2422        4.0234        4.6484       Sitting  
     -73.409     0.080133      1.3786       0.21548       2.689        4.602       3.7069       3.2548      0.0064214      0.2053      0.0035392     0.0015361      0.0061205     0.0010848    0.0072086     0.0055945       1.5625        2.3828        3.0469        3.5156        3.8672        4.6484       Sitting  
      -73.43     0.071148      1.1902       0.13832      1.9441       2.6268       2.3139       3.0519      0.0050621     0.25175      0.0022982     0.0027692      0.0040954     0.0045089    0.0016846      0.003589      0.82031        2.3047        3.1641        3.9062        4.2188        4.5312       Sitting  
     -73.441     0.091667       1.169       0.19139      2.0879       2.7515       2.4408       2.8127      0.0084028     0.25907      0.0021497     0.0029254      0.0035706     0.0018514     0.015439     0.0030516      0.89844        2.1094        2.3828        2.6562        3.0859        4.5703       Sitting  
     -73.419      0.10858      1.1976       0.20506      1.8886       2.5625       2.2619       2.3954       0.011789     0.17288       0.010823     0.0088772      0.0078451     0.0071845    0.0066219     0.0024052      0.74219        1.5625        2.2656        3.0469        3.8281        4.5312       Sitting  

Partition the data set by assigning 75% of the signals for training and 25% for testing. Use the cvpartition function to ensure the partitions contain activity labels in similar proportions.

% Extract predictors and response
predictors = featureTable(:, 1:end-1);
response = featureTable.ActivityID;

% For reproducible results
rng default

% Partition the data and extract training predictors and response data
cvp = cvpartition(response,'Holdout',0.25);
trainingPredictors = predictors(cvp.training, :);
trainingResponse = response(cvp.training, :);

% Train the classifier
template = templateSVM(...
    'KernelFunction', 'polynomial', ...
    'PolynomialOrder', 2, ...
    'KernelScale', 'auto', ...
    'BoxConstraint', 1, ...
    'Standardize', true);
classificationSVM = fitcecoc(...
    trainingPredictors, ...
    trainingResponse, ...
    'Learners', template, ...
    'Coding', 'onevsone', ...
    'ClassNames',actioncats);
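
To see the effect of the stratified holdout split used above, the following sketch partitions a hypothetical, deliberately imbalanced label vector and compares the class proportions in each partition. The labels are made up for the illustration and are not the example data.

```matlab
% Hypothetical imbalanced labels: 400 Walking and 100 Running observations.
rng default                                    % for reproducible results
labels = categorical([repmat("Walking",400,1); repmat("Running",100,1)]);

% A holdout partition built from a grouping variable is stratified,
% so each class keeps roughly the same share in both partitions.
cvpDemo = cvpartition(labels, 'Holdout', 0.25);

trainFraction = mean(labels(training(cvpDemo)) == "Running");
testFraction  = mean(labels(test(cvpDemo)) == "Running");

fprintf("Running share: %.2f in training, %.2f in test\n", trainFraction, testFraction)
```

Both shares stay close to the overall 20%, which keeps the test accuracy from being skewed by an unlucky split of a minority class.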

Test the classifier on the test partition and analyze its classification accuracy.

% Extract test predictors and response data
testPredictors = predictors(cvp.test, :);
testResponse = response(cvp.test, :);

% Predict activity on the test data
testPredictions = predict(classificationSVM,testPredictors);

% Plot the confusion matrix to analyze performance of the classifier
figure
cm = confusionchart(testResponse, testPredictions, ...
    ColumnSummary="column-normalized", RowSummary="row-normalized");

accuracy = trace(cm.NormalizedValues)/sum(cm.NormalizedValues, "all");
fprintf("The classification accuracy on the test partition is %2.1f%%", accuracy*100)
The classification accuracy on the test partition is 95.0%

Most of the errors are Running signals misclassified as Walking and Standing signals misclassified as Sitting.
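
The accuracy calculation and the row and column summaries of a confusion chart can be reproduced by hand on a small case. The labels below are hypothetical and are not results from this example; the sketch uses accumarray to build the count matrix so that it runs with core MATLAB only.

```matlab
% Hypothetical labels for illustration; not results from the example.
trueLabels      = [1 1 1 2 2 2 3 3 3 3]';
predictedLabels = [1 1 2 2 2 2 3 3 3 1]';
numClasses = 3;

% Count matrix: rows are true classes, columns are predicted classes.
C = accumarray([trueLabels predictedLabels], 1, [numClasses numClasses]);

accuracy  = trace(C)/sum(C, "all");   % correct predictions over all predictions
recall    = diag(C)./sum(C, 2);       % row-normalized diagonal: per true class
precision = diag(C)./sum(C, 1)';      % column-normalized diagonal: per predicted class

disp(C)
fprintf("Accuracy: %.1f%%\n", accuracy*100)
```

The off-diagonal entries show which class pairs are confused, which is the information used above to identify the dominant error types.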

Summary

In this example, you saw how to extract features for human activity recognition from smartphone sensor signals using signalTimeFeatureExtractor and signalFrequencyFeatureExtractor. You also saw how to use the extracted features to train an SVM model that achieved about 95% accuracy on the test partition. As an alternative approach, you can explore using a featureInput layer to train a deep learning classifier.

References

[1] El Helou, A. Sensor HAR recognition App. MathWorks File Exchange https://www.mathworks.com/matlabcentral/fileexchange/54138-sensor-har-recognition-app

[2] El Helou, A. Sensor Data Analytics. MathWorks File Exchange https://www.mathworks.com/matlabcentral/fileexchange/54139-sensor-data-analytics-french-webinar-code
