How to remove outliers from 2D array

Question


I have been trying to solve a simple problem for a while now and can't seem to succeed other than by brute force.

I have a 2D array. I want to do statistics on it (i.e., compute mean and std dev). However, there are occasionally invalid values in the array (say, below threshold1 or above threshold2). I'd like to either replace those values with nulls, which would make mean and std ignore them, or find some other way to ignore them.

For instance, consider: a = reshape(rand(100,1),25,4); a(a>0.9) = 10; a(a<0.1) = -10;

I would like to then compute things like: b = mean(a,2);

but exclude the elements > 1 and < 0 in the computation. If I could exclude them, then elements of b would be averages of 0 to 4 numbers.

Using things like: a(a<0.1) = [];

doesn't work because it turns the 2D array into 1D which can't be reshaped back to original format.
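To see the shape problem concretely, here is a minimal sketch on a small fixed matrix (the values are made up for illustration). Deleting elements through a logical mask collapses the matrix to a vector, whereas assigning NaN through the same mask keeps the 2D layout.

```matlab
% Small fixed example; 10 and -10 stand in for invalid readings.
a = [0.2 10 0.4; -10 0.5 0.6];

% Deleting through a logical mask flattens the matrix (column-major order).
v = a;
v(v < 0 | v > 1) = [];
disp(size(v))      % a 1-by-4 row vector; the 2-by-3 layout is lost

% Assigning NaN through the same mask leaves the size unchanged.
m = a;
m(m < 0 | m > 1) = NaN;
disp(size(m))      % still 2-by-3
```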


Answer 1

Jim Hokanson, 25 July 2013

Replace invalid values with NaN.

You can then use nanmean from the Statistics Toolbox, or there is a File Exchange (FEX) submission with a similar function:

http://www.mathworks.com/matlabcentral/fileexchange/6837-nan-suite/content/nansuite/nanmean.m
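If neither the Statistics Toolbox nor the File Exchange function is at hand, newer MATLAB releases (R2015a and later, as far as I know) accept an 'omitnan' flag in the built-in mean and std. A minimal sketch, assuming valid data lie in [0, 1] as in the question; the matrix values are made up:

```matlab
a = [0.2 10 0.4; -10 0.5 0.6; 10 -10 10];   % made-up data
a(a < 0 | a > 1) = NaN;                      % mark invalid entries

b = mean(a, 2, 'omitnan');      % row means over valid entries only
s = std(a, 0, 2, 'omitnan');    % row standard deviations (N-1 normalisation)
```

A row with no valid entries comes back as NaN, which covers the "average of 0 numbers" case from the question.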

Good luck

Comment

Varoujan, 25 July 2013

Thank you for your suggestion. I was aware of nanmean in the Statistics Toolbox, but I don't have that toolbox. I didn't realize someone had posted the nansuite on the File Exchange. That solved my problem.

Answer 2

Andrei Bobrov, 25 July 2013

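l1 = a <= 1 & a >= 0;                 % logical mask of valid entries
n = sum(l1,2);                        % number of valid entries in each row
mn = sum(a.*l1,2)./n;                 % row means over valid entries
sd = sqrt(sum((bsxfun(@minus,a,mn).*l1).^2,2)./n);   % row std, normalised by n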

Or, if you have the Statistics Toolbox:

a1 = a;
a1(~l1) = nan;
mn2 = nanmean(a1,2);
sd2 = nanstd(a1,1,2);
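As a quick sanity check, the mask-based formulas can be compared against the NaN-based route on a small fixed matrix. This sketch uses the built-in 'omitnan' flag instead of nanmean/nanstd so that it runs without the toolbox (an assumption: it needs a release that supports that flag), and the data are made up.

```matlab
a = [0.2 10 0.4 0.9; -10 0.5 0.6 0.1];   % made-up data, valid range [0, 1]

valid = a >= 0 & a <= 1;                 % logical mask of valid entries
n  = sum(valid, 2);                      % valid count per row
mn = sum(a .* valid, 2) ./ n;            % masked row means
dev = bsxfun(@minus, a, mn) .* valid;    % deviations, zeroed where invalid
sd = sqrt(sum(dev.^2, 2) ./ n);          % population std (divide by n)

a1 = a;
a1(~valid) = NaN;
mn2 = mean(a1, 2, 'omitnan');
sd2 = std(a1, 1, 2, 'omitnan');          % weight 1 selects division by n

disp(max(abs(mn - mn2)))                 % differences should be at rounding level
disp(max(abs(sd - sd2)))
```

Note that a.*valid only works when the invalid entries are finite: Inf or NaN times 0 gives NaN, so such entries would need to be set to 0 before summing.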


Answer 3

Varoujan, 25 July 2013

Thanks to suggestions by Jim and Andrei, I now have a solution to my problem. The code below illustrates the solution:

% Create a test array a. Leave the first column alone (it is the index axis)
% and replace outliers in columns 2:4 with NaN, then use nanmean from the
% MathWorks File Exchange.
a = reshape(rand(100,1),25,4);
a(a>0.9) = 10;
a(a<0.1) = -10;
% Approach 1: mask columns 2:4 separately, then reassemble
logic1 = or(a(:,2:4) < 0, a(:,2:4) > 1);
b = a(:,2:4);
b(logic1) = NaN;
c = [a(:,1), b];
ans1 = nanmean(c(:,3));
% Approach 2: pad the mask with a false column and apply it to a copy of a
logic2 = [false(size(a,1),1), logic1];
d = a;
d(logic2) = NaN;
ans2 = nanmean(d(:,3));
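The same result can be reached with one fewer intermediate array by editing the column block and writing it back. A minimal sketch on made-up data, using mean with 'omitnan' in place of the File Exchange nanmean (an assumption about the MATLAB release); the variable names are only for illustration:

```matlab
a = [1 0.2 10 0.4; 2 -10 0.5 0.6; 3 0.3 0.7 10];  % column 1 is the index axis

cols = 2:4;                          % data columns to clean
blk = a(:, cols);
blk(blk < 0 | blk > 1) = NaN;        % mark outliers in the data block only
a(:, cols) = blk;                    % write back; the index column is untouched

colMean = mean(a(:, 3), 'omitnan');  % mean of column 3 over valid entries
```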


Answer 4

Maziyar, 28 July 2015

I think it would be better to replace outliers with the mean value of the matrix. From a statistical point of view this seems more accurate to me than ignoring the outliers, although it might increase the running time.

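m = mean(Matrix(:));    % overall mean, computed once (same value as mean2(Matrix))
Mask = Matrix;          % copy that receives the replacements
for i = 1 : numel(Matrix)
    % outlier test assumed from the question: values outside [0, 1]
    if Matrix(i) > 1 || Matrix(i) < 0
        Mask(i) = m;
    end
end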
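One thing worth checking before replacing outliers with a mean is the effect on the spread. In the sketch below (made-up data, valid range [0, 1] assumed from the question) the outliers are replaced with the mean of the valid entries only, which is my assumption and not necessarily what was meant above. The mean then matches the one obtained by ignoring the outliers, but the standard deviation comes out smaller, because the filled-in values sit exactly at the centre.

```matlab
x = [0.2 0.4 10 0.6 -10 0.8];        % made-up data
valid = x >= 0 & x <= 1;

mIgnore = mean(x(valid));            % statistics over valid entries only
sIgnore = std(x(valid));

y = x;
y(~valid) = mIgnore;                 % replace outliers with the valid mean
mFill = mean(y);
sFill = std(y);

fprintf('mean: %.4f vs %.4f\n', mIgnore, mFill);
fprintf('std:  %.4f vs %.4f\n', sIgnore, sFill);
```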
