PCA with high-dimensional, low-sample-size data
I am using PCA to detect abnormalities in time-series data. My dataset is high-dimensional with few samples (a 15×530 data matrix: 15 samples, 530 variables). Can I still use PCA to obtain statistics such as T^2 and SPE? Some articles state that it is improper to use PCA to obtain these statistics in such cases.
Answers (1)
Aditya
27 June 2024
Using PCA to detect abnormalities in time-series data is challenging when the dataset is high-dimensional and has few samples. The main concern is that PCA relies on the sample covariance matrix, which is poorly estimated when the number of features greatly exceeds the number of samples. With 15 samples and 530 variables, the centered data has rank at most 14, so PCA can return at most 14 components with nonzero variance, and the T² and SPE statistics are only meaningful if you retain fewer components than that.
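To make the concern concrete, here is a short sketch of the two statistics and of what happens when every available component is kept. The notation (n samples, p variables, k retained components, loadings P_k, eigenvalues λ_j) is introduced here for illustration and is not from the question. The last line assumes the centered data has rank exactly n - 1, which is the generic case.

```latex
% n = 15 samples, p = 530 variables, k retained components
% t_i = P_k^\top (x_i - \bar{x}) are the scores of sample i
\begin{aligned}
T_i^2 &= \sum_{j=1}^{k} \frac{t_{ij}^2}{\lambda_j}, \\
\mathrm{SPE}_i &= \bigl\lVert (x_i - \bar{x}) - P_k t_i \bigr\rVert^2, \\
\operatorname{rank}(S) &\le n - 1 = 14 \ll p = 530, \\
k = n - 1 \;&\Rightarrow\; T_i^2 = \frac{(n-1)^2}{n} = \frac{196}{15} \approx 13.07
\ \text{ and } \ \mathrm{SPE}_i = 0 \ \text{ for every } i .
\end{aligned}
```

In other words, if all 14 components are kept, T² takes the same value for every training sample and SPE vanishes, so neither statistic can separate normal from abnormal samples. Keeping k < n - 1 components avoids this degeneracy, although the estimates remain noisy with so few samples.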
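% Simulate high-dimensional, low-sample data (15 samples, 530 variables)
rng(0);
DATASET = rand(15, 530);
% Apply PCA; pca centers the data and returns at most n-1 = 14 components here.
% mu is the row vector of column means that pca subtracted.
[coeff, score, latent, ~, ~, mu] = pca(DATASET);
% Retain k < n-1 components (k = 3 is an arbitrary choice for illustration).
% With all 14 components, T² is identical for every sample and SPE is zero.
k = 3;
% Calculate T² statistic from the retained scores
T2 = sum(score(:, 1:k).^2 ./ latent(1:k)', 2);
% Calculate SPE (Q-statistic); add mu back so the residual uses centered data
reconstructed = score(:, 1:k) * coeff(:, 1:k)' + mu;
SPE = sum((DATASET - reconstructed).^2, 2);
% Set thresholds for T² and SPE (95% confidence level).
% The chi-square limit is a large-sample approximation, so it is only a
% rough guide with 15 samples.
alpha = 0.05;
T2_threshold = chi2inv(1 - alpha, k);
% Empirical percentile: by construction this flags about 5% of the training samples
SPE_threshold = prctile(SPE, 100 * (1 - alpha));
% Detect abnormalities
abnormal_T2 = T2 > T2_threshold;
abnormal_SPE = SPE > SPE_threshold;
disp('Abnormalities detected by T²:');
disp(abnormal_T2);
disp('Abnormalities detected by SPE:');
disp(abnormal_SPE);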
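Because the chi-square limit is a large-sample approximation, a small-sample alternative is often used for T². The sketch below compares the two, assuming the retained scores are approximately multivariate normal and that the limit is applied to new observations rather than to the training samples. The values n = 15 and k = 3 are hypothetical choices for illustration, and `finv` and `chi2inv` require the Statistics and Machine Learning Toolbox.

```matlab
% Hypothetical sizes: n training samples, k retained components (k < n - 1)
n = 15;
k = 3;
alpha = 0.05;

% Large-sample limit: chi-square with k degrees of freedom
T2_chi2 = chi2inv(1 - alpha, k);

% Small-sample limit for a new observation, based on the F distribution:
% T2 ~ k*(n^2 - 1) / (n*(n - k)) * F(k, n - k)
T2_F = k * (n^2 - 1) / (n * (n - k)) * finv(1 - alpha, k, n - k);

fprintf('Chi-square limit: %.3f\n', T2_chi2);
fprintf('F-based limit:    %.3f\n', T2_F);
```

With so few samples the F-based limit is noticeably larger than the chi-square limit, so the chi-square threshold would tend to flag more points as abnormal. Neither limit fixes the underlying problem that the covariance estimate itself is unreliable at this sample size.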