필터 지우기
필터 지우기

How can I replace outliers with 2 standard deviation from the mean

조회 수: 16 (최근 30일)
AN NING
AN NING 2021년 2월 6일
편집: dpb 2021년 2월 8일
How can I replace outliers with 2 standard deviation from the mean?
This script is to replace with mean, but not 2std from the mean, anyone could help to modify this?
A=filloutliers(A,'center','mean','ThresholdFactor', 2);% replace 2 std away from mean with mean

채택된 답변

dpb
dpb 2021년 2월 6일
편집: dpb 2021년 2월 8일
Can't do it with filloutliers alone; it for some reason doesn't have the facility to use a function handle as a fill option...
One way amongst many but one that keeps filloutliers in the code--
A=filloutliers(A,'center',nan,'ThresholdFactor', 2); % step 1: replace 2 std away from mean with NaN
A(isnan(A))=2*std(A); % step 2: replace NaN w/ 2*std()
Does seem like a reasonable enhancement request to be able to use a function handle in filloutliers
Alternatively,
mnA=mean(A); sdA=std(A);
Z=(A-mnA)/sdA;
isOut=(abs(Z)>=2);
A(isOut)=mnA+sign(Z(isOut))*2*sdA;
ADDENDUM:
To consolidate in one place, with the added caveat the data are 2D by column, the above must be extended as follows:
Since mnA and sdA are now row vectors of column statistics, Z needs the "dot" division operator to be element-wise:
Z=(A-mnA)./sdA;
and have to apply the outlier calculation by column as well. It's probably just as quick here to write the explicit loop as:
for i=1:size(A,2)
A(isOut(:,i))= mnA(i)+sign(Z(isOut(:,i)))*2*sdA(i);
end
This way each column is a vector so the size of the logical elements selected will match and the mean and std dev are constants for the column instead of arrays.
  댓글 수: 9
AN NING
AN NING 2021년 2월 8일
Thank you for replying.
The matrix A represent different variables by column, so I need to replace the outliers in each column variable with the 2 standard deviation of the mean in that columb variable. How can I solve this problem then?
dpb
dpb 2021년 2월 8일
Shoulda' told us that going in... :)
As written above, then need two changes -- since mnA and sdA are now row vectors of column statistics, Z needs the "dot" division operator to be element-wise:
Z=(A-mnA)./sdA;
and have to apply the outlier calculation by column as well. It's probably just as quick here to write the explicit loop as:
for i=1:size(A,2)
A(isOut(:,i))= mnA(i)+sign(Z(isOut(:,i)))*2*sdA(i);
end
This way each column is a vector so the size of the logical elements selected will match and the mean and std dev are constants for the column instead of arrays.

댓글을 달려면 로그인하십시오.

추가 답변 (0개)

카테고리

Help CenterFile Exchange에서 MATLAB Coder에 대해 자세히 알아보기

태그

제품


릴리스

R2019a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by