How to replace some of the values in a matrix with NaN?

Isti on 21 Apr 2012
The simple case is like this:
2 1 4 6 2
9 4 6 1 2
5 3 2 8 3
7 2 1 9 3
7 1 8 2 4
In the matrix above, I want to insert 3 NaNs at random positions. My code looks like this:
Data = [2,1,4,6,2;9,4,6,1,2;5,3,2,8,3;7,2,1,9,3;7,1,8,2,4];
[rows,cols] = size(Data);
p = 3; % number of NaNs to insert
r = randperm(25); % random permutation of the integers 1 to 25
r = r(1:3); % keep 3 distinct random numbers from the range 1-25
i = 1; a = 1; b = 1;
while i <= 3 % turn each number in vector r into the position of a NaN
    n = r(a,b);
    b = b+1;
    e = 1;
    if n <= cols
        Data(1,n) = NaN;
    else
        if n > cols
            while n > cols
                e = e+1;
                k = n - cols;
                n = k;
            end
            Data(e,n) = NaN;
        end
    end
    i = i+1;
end
One possible output looks like this:
2 1 4 6 2
9 NaN NaN NaN 2
5 3 2 8 3
7 2 1 9 3
7 1 8 2 4
I want to add some constraints:
1. Every row can have at most 2 NaNs.
2. The number of NaNs in column 1 has to be less than in column 2, the number in column 2 has to be less than in column 3, and so on. For example, the output matrix could look like this:
2 1 4 6 2
9 4 6 1 NaN
5 3 2 8 3
7 2 1 NaN 3
7 1 8 2 NaN
For the matrix above we can see that:
number of NaNs in column 1 = 0, column 2 = 0, column 3 = 0, column 4 = 1, column 5 = 2.
Can somebody help me add these constraints to my code above? Or maybe there is another solution altogether.
Thanks in advance :')
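For reference, the unconstrained part of the code above (placing p NaNs at distinct random positions) can be sketched with linear indexing. This is only a minimal sketch; it assumes a MATLAB release in which randperm accepts a second argument, and it does not enforce either constraint.

```matlab
Data = [2,1,4,6,2;9,4,6,1,2;5,3,2,8,3;7,2,1,9,3;7,1,8,2,4];
p = 3;                            % number of NaNs to insert
idx = randperm(numel(Data), p);   % p distinct linear indices into Data
Data(idx) = NaN;                  % linear indexing, no row/column arithmetic needed
```

Because randperm draws without replacement, the p positions never coincide, so exactly p entries are replaced.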
Comments: 2
per isakson on 22 Apr 2012
Are you aware of the function, [I,J] = ind2sub(siz,IND)?
Isti on 22 Apr 2012
No, I'm not. I'm new to MATLAB :(
Could you tell me more about it, and how it would help with my problem?


Answers (3)

per isakson on 22 Apr 2012
This is an idea that I have not tested!
jj = 0;
for ii = r
    [rr,cc] = ind2sub( size(Data), ii );
    if sum(isnan(Data(rr,:)))>=2 || sum(isnan(Data(:,cc)))>=2
        % do nothing
    else
        Data(rr,cc) = nan;
        jj = jj + 1;
        if jj == 3, break
        end
    end
end
--- EDIT ---
The function below will return a result. The constraint is "no more than two NaNs in any column or row". However, that is not what you asked for.
function Data = cssm
Data = [2,1,4,6,2;9,4,6,1,2;5,3,2,8,3;7,2,1,9,3;7,1,8,2,4];
p = 3; % number of NaNs to insert
row_vector = randperm(numel(Data));
jj = 0;
for ii = row_vector
    [rr,cc] = ind2sub( size(Data), ii );
    if sum(isnan(Data(rr,:)))>=2 || sum(isnan(Data(:,cc)))>=2
        % do nothing
    else
        Data(rr,cc) = nan;
        jj = jj + 1;
        if jj == p, break
        end
    end
end
end
With the constraint "the number of NaNs in column 1 has to be less than in column 2, the number in column 2 has to be less than in column 3, and so on", there is no solution. Do you exclude columns with zero NaNs from that constraint?
Thus, according to my reading, the last column can have two or three NaNs and the second-to-last column one or zero NaNs. NaNs cannot appear in the other columns.
Comments: 4
Isti on 22 Apr 2012
Oh, I think your suggested code doesn't fulfill my second constraint :(
Isti on 24 Apr 2012
Of course not; the columns with zero NaNs are included as well. When a column has zero NaNs, it will be among the leftmost columns of the matrix.
By the way, what is ind2sub used for above? I don't get it yet.
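As a small illustration of what ind2sub does in the code above: it converts linear indices (MATLAB counts matrix elements column by column) into row and column subscripts. The index values below are hypothetical, chosen only for the example.

```matlab
Data = [2,1,4,6,2;9,4,6,1,2;5,3,2,8,3;7,2,1,9,3;7,1,8,2,4];
idx = [7 13 25];                       % example linear indices into the 5-by-5 matrix
[rr, cc] = ind2sub(size(Data), idx);   % matching row and column subscripts
% Column-major counting: index 7 is row 2 of column 2, index 13 is row 3
% of column 3, and index 25 is row 5 of column 5.
Data(idx) = NaN;                       % same effect as setting Data(rr(k),cc(k)) one by one
```

Having the row and column of each candidate position is what makes it possible to count the NaNs already present in that row or column before inserting a new one.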



Richard Brown on 22 Apr 2012
This is another one of these problems where the simplest way to solve it is to randomly generate candidates until you find one that fits:
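A = reshape(randperm(25), 5, 5);
done = false;
while ~done
    idx = randperm(25, 3);          % 3 distinct linear indices
    [I, J] = ind2sub([5 5], idx);   % row and column subscripts
    m = hist(I, 1:5);               % number of NaNs in each row
    n = hist(J, 1:5);               % number of NaNs in each column, zeros included
    done = all(m <= 2) && all(diff(n) >= 0);
end
A(idx) = NaN;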
It's trivial (but a little messier) to make it more general, so I'll leave you to do that if you need to.
EDIT: changed the code to use randperm instead of randi; only one call to the random number generator is necessary.
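One way the generalisation might look is sketched below. The example matrix, the variable names (p, maxPerRow, rowCounts, colCounts) and the use of accumarray in place of hist are assumptions made for this sketch, not part of the answer above. The column test accepts equal counts in neighbouring columns, so columns without NaNs end up on the left.

```matlab
A = magic(6);                  % hypothetical example data; any m-by-n matrix works
[m, n] = size(A);
p = 4;                         % number of NaNs to insert
maxPerRow = 2;                 % constraint 1: at most 2 NaNs per row
done = false;
while ~done
    idx = randperm(m*n, p);                   % candidate positions, all distinct
    [I, J] = ind2sub([m n], idx);
    rowCounts = accumarray(I(:), 1, [m 1]);   % NaNs per row, zeros included
    colCounts = accumarray(J(:), 1, [n 1]);   % NaNs per column, zeros included
    % constraint 2: counts must not decrease from left to right
    done = all(rowCounts <= maxPerRow) && all(diff(colCounts) >= 0);
end
A(idx) = NaN;
```

Since this is rejection sampling, the expected number of attempts grows quickly with the matrix size and with p, so it is only practical for small problems.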
Comments: 1
Isti on 28 Apr 2012
Thanks for this answer. It works on my small dataset, but for my medium dataset (about 1500 rows by 11 columns) with a larger number of NaNs to insert, it takes a very long time, and in the end I cancelled it :(
If I drop the 2nd constraint and keep only the 1st, is there any way to make it faster?
Thanks in advance.



Richard Brown on 29 Apr 2012
Here's a much faster method that satisfies both of your constraints. It may be possible to vectorise the loop, but it is, in my opinion, not worth the effort.
First, generate the data
X = rand(1500, 11);
[m,n] = size(X);
nNans = 2000;
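We figure out the row and column indices separately. Rows are easy: a single call to randperm does the trick, because drawing distinct values from 1 to 2*m and reducing them modulo m lets each row index appear at most twice.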
I = mod(randperm(2*m, nNans), m) + 1;
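To see why this expression respects the two-per-row limit, here is a small check on a toy size. The values of m and nNans and the variable counts are assumptions for this sketch only; note that nNans must not exceed 2*m.

```matlab
m = 5;                                   % toy number of rows
nNans = 7;                               % must be at most 2*m
I = mod(randperm(2*m, nNans), m) + 1;    % row index for each NaN
counts = accumarray(I(:), 1, [m 1]);     % how often each row was chosen
% The values v and v+m (for v = 1..m) are the only two that map to the
% same row, and randperm never repeats a value, so no count exceeds 2.
```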
Then figure out the column positions randomly, going row by row to avoid creating duplicate entries.
J = zeros(1, nNans);
for i = 1:m
    idx = (I == i);                    % NaNs that fall in row i (at most 2)
    J(idx) = randperm(n, nnz(idx));    % distinct columns within the row
end
We now need to make sure the columns are ordered correctly. So we construct a logical matrix encoding the position of the NaN entries, and reorder the columns to satisfy your column constraint.
iNan = false(m, n);
iNan(sub2ind([m n], I, J)) = true;
[~, iSorted] = sort(hist(J, 1:n));
iNan = iNan(:, iSorted);
We now have a logical array with the right properties. The last step is to overwrite the corresponding entries of X:
X(iNan) = nan;
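A quick way to convince yourself that the result meets both constraints is to count the NaNs per row and per column afterwards. The sketch below reruns the same steps on a smaller, hypothetical matrix so that it is self-contained; nanPerRow and nanPerCol are names introduced here, and sum(iNan, 1) is used in place of hist(J, 1:n) to get the same column counts.

```matlab
X = rand(20, 6);                         % smaller hypothetical data set
[m, n] = size(X);
nNans = 15;                              % must be at most 2*m
I = mod(randperm(2*m, nNans), m) + 1;    % row of each NaN, each row at most twice
J = zeros(1, nNans);
for i = 1:m
    idx = (I == i);
    J(idx) = randperm(n, nnz(idx));      % distinct columns within row i
end
iNan = false(m, n);
iNan(sub2ind([m n], I, J)) = true;
[~, iSorted] = sort(sum(iNan, 1));       % order columns by their NaN count
X(iNan(:, iSorted)) = NaN;
nanPerRow = sum(isnan(X), 2);            % should never exceed 2
nanPerCol = sum(isnan(X), 1);            % should be non-decreasing, left to right
```

The column counts come out non-decreasing rather than strictly increasing, which matches the example in the question where the first three columns all contain zero NaNs.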
