Replacing certain values in table with NaN based on different conditions

Hello everyone,
I'm basically new to Matlab and seem to be at a loss with my current problem - so any help would be greatly appreciated.
I have some data stored in a table called T that is 724 rows x 365 variables in size.
Now for some variables, I need to exclude extreme values based on different conditions before continuing on with my statistical analysis. But since these don't warrant excluding the whole dataset (row), I would like to replace those values with NaN in my table T.
As of now, this concerns only the variable in column 129, were values should be replaced with NaN if they are equal to or less than 3.1006 as well as column 131 were values should be replaced with NaN if they are equal to or less than 4.8673.
I have tried to get this to work for hours with several variations of logical indexing and am now stuck at this:
extremevalues = (T{:,129} <= 3.1006) | (T{:,131} <= 4.8673);
T(extremevalues) = NaN;
This does not give me any errors - which is the farthest I've come so far - however, it does not replace those extreme value in the respective columns with NaNs (there are no NaNs to be seen in either one of the specified columns in T, extremevalues however gives me several "1"s when quickly scrolling through it).
I've also tried substituting T{:, variablenr} with T.variablename - that didn't change anything either though.
I feel like the solution should be pretty easy, but I can't seem to find it (or am simply not experienced enough to find it).
Thanks very much in advance!

답변 (2개)

Maybe you need to convert your data into matrix first
A = randi(10,5,4)
A = 5×4 table
Var1 Var2 Var3 Var4 ____ ____ ____ ____ 7 8 8 2 1 2 1 5 3 7 1 7 1 5 5 7 2 10 8 4
ind = A(:,1) < 5; % have this error when using 'table'
Operator '<' is not supported for operands of type 'table'.
A(ind,:) = nan

댓글 수: 1

Liza M
Liza M 2021년 8월 17일
편집: Liza M 2021년 8월 17일
I think that would've worked if my table weren't of a mixed data type (there are other variables besides numeric ones in there as well).
Anyways, I just discovered the rmoutliers function, which does the thing I need with the specified variables..in just a lot less complicated way than what I've tried.
Thanks again regardless for your quick help!

댓글을 달려면 로그인하십시오.

For table, you need to directly access the variable names/column indices. so, T(extremevalues) = NaN; won't work.
This may help:
t = splitvars(table(reshape(1:20, 5, 4)))
t = 5×4 table
Var1_1 Var1_2 Var1_3 Var1_4 ______ ______ ______ ______ 1 6 11 16 2 7 12 17 3 8 13 18 4 9 14 19 5 10 15 20
% set values below 8 in column 2 to nan
colN = 2;
cutoff = 8;
t.(colN)(t.(colN) < cutoff) = nan
t = 5×4 table
Var1_1 Var1_2 Var1_3 Var1_4 ______ ______ ______ ______ 1 NaN 11 16 2 NaN 12 17 3 8 13 18 4 9 14 19 5 10 15 20

댓글 수: 1

Hi,
so rmoutliers solved my problem this time around, but this is a good tip for the future, thanks for that!

댓글을 달려면 로그인하십시오.

카테고리

도움말 센터File Exchange에서 Tables에 대해 자세히 알아보기

제품

릴리스

R2021a

태그

질문:

2021년 8월 16일

댓글:

2021년 8월 17일

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by