About using PCA on high-dimensional, low-sample data

Yimin Chen on 8 Nov 2016
Answered: Aditya on 27 Jun 2024
I am using PCA to detect abnormalities in time-series data. I currently have a high-dimensional, low-sample dataset (a 15-by-530 data matrix). I am wondering whether I can use PCA to obtain statistics such as T^2 and SPE. I have noticed that some articles state that it is improper to use PCA to obtain these statistics in such a case.

Answers (1)

Aditya on 27 Jun 2024
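Using PCA to detect abnormalities in time-series data is challenging when the dataset is high-dimensional and has few samples. The main concern is that PCA may not give reliable results when the number of features (530 here) greatly exceeds the number of samples (15). PCA relies on the sample covariance matrix, which is poorly estimated in this regime: with n samples it has rank at most n - 1, so at most 14 principal components can have nonzero variance. T² and SPE are only informative if fewer components than that are retained, and their control limits should be treated as rough guides rather than exact confidence levels.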
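The rank limitation behind this concern can be checked directly. The snippet below is a minimal sketch, assuming Statistics and Machine Learning Toolbox is available for pca and using random data of the same 15-by-530 shape as a stand-in for the real measurements (the variable names X, Xc, and r are illustrative only).

```matlab
% Sketch: rank deficiency of the sample covariance when n << p.
% X is random stand-in data with the same shape as in the question.
rng(0);
X = rand(15, 530);
[n, p] = size(X);

Xc = X - mean(X, 1);          % center each variable (column)
r = rank(Xc);                 % centered data has rank at most n - 1

[coeff, ~, latent] = pca(X);  % pca centers the data by default
fprintf('samples n = %d, variables p = %d\n', n, p);
fprintf('rank of centered data = %d\n', r);
fprintf('components returned by pca = %d\n', numel(latent));
```

However many variables are measured, no more than n - 1 directions carry any sample variance, so the remaining directions of the 530-dimensional space are invisible to the model.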
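% Simulate high-dimensional, low-sample data (15 samples, 530 variables)
rng(0);
DATASET = rand(15, 530);
% Apply PCA; pca centers the data and returns at most n-1 = 14 components here
[coeff, score, latent, ~, ~, mu] = pca(DATASET);
% Retain k < n-1 components. With all 14 components the residual (SPE) is zero
% and T² takes the same value for every training sample, so neither is useful.
k = 3;  % illustrative choice; select it from explained variance or cross-validation
% Calculate T² statistic from the retained scores
T2 = sum((score(:, 1:k) ./ sqrt(latent(1:k)')).^2, 2);
% Calculate SPE (Q-statistic); add the mean back because pca removed it
reconstructed = score(:, 1:k) * coeff(:, 1:k)' + mu;
SPE = sum((DATASET - reconstructed).^2, 2);
% Set thresholds for T² and SPE (95% confidence level)
alpha = 0.05;
% Chi-square limit with k degrees of freedom; only approximate for n = 15
T2_threshold = chi2inv(1 - alpha, k);
% Empirical percentile; by construction it flags about 5% of the training samples
SPE_threshold = prctile(SPE, 100 * (1 - alpha));
% Detect abnormalities
abnormal_T2 = T2 > T2_threshold;
abnormal_SPE = SPE > SPE_threshold;
disp('Abnormalities detected by T²:');
disp(abnormal_T2);
disp('Abnormalities detected by SPE:');
disp(abnormal_SPE);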
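The chi-square limit for T² treats the component variances as known, which is hard to justify with 15 samples. A commonly used small-sample alternative is an F-based limit. The worked equation below is a sketch under stated assumptions: the retained scores are roughly multivariate normal, the limit is applied to new observations that were not used to fit the model, and k = 3 retained components is a hypothetical choice for illustration.

```latex
% F-based T^2 control limit for a new observation,
% with n training samples and k retained components.
% F_{1-\alpha}(k, n-k) is the (1-\alpha) quantile of the F distribution.
T^2_{\lim} = \frac{k\,(n-1)(n+1)}{n\,(n-k)}\; F_{1-\alpha}(k,\; n-k)

% Worked example with n = 15, k = 3 (hypothetical), \alpha = 0.05:
T^2_{\lim} = \frac{3 \cdot 14 \cdot 16}{15 \cdot 12}\; F_{0.95}(3,\, 12)
           \approx 3.73 \times 3.49 \approx 13.0

% For comparison, the chi-square limit with 3 degrees of freedom:
\chi^2_{0.95}(3) \approx 7.81
```

In MATLAB the F quantile is available as finv(1 - alpha, k, n - k). Under these assumptions the F-based limit is noticeably larger than the chi-square one at this sample size, so the chi-square limit would flag new observations more often than the nominal 5%.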
