Replace NaNs with previous values

Hello,
I have the following problem: I would like to replace each NaN with the previous non-NaN value in the same column.
A =
4 5 6 7 8
32 NaN NaN 21 NaN
12 NaN 12 NaN NaN
34 NaN NaN NaN NaN
B =
4 5 6 7 8
32 5 6 21 8
12 5 12 21 8
34 5 12 21 8
I solved it like this:
for i = 2:5
    [r,c] = find(isnan(A(:,i)));
    while sum(isnan(A(:,i)))>0
        A(r,i) = A(r-1,i);
    end
end
I'm sure there is a way to avoid the for and while statements. I am looking for an "elegant" solution.
Can someone help me?

2 Comments

Matt Fig on 9 Oct 2012
What if a whole column is NaN? Which value should fill it?
Johannes on 9 Oct 2012
If the first value is NaN, everything should stay NaN until a non-NaN value appears in the column.
Thanks, Johannes


Answers (5)

Moshe Flam on 3 Dec 2017
Edited: Moshe Flam on 4 Dec 2017

5 votes

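Use `fillmissing`, as described in the MATLAB documentation. The third argument is the dimension to fill along: 2 fills along each row, while 1 (the default) fills down each column, which is what the question asks for.
ROWBYROW = 2;
B = fillmissing(A,'previous',ROWBYROW);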

2 Comments

Rasoul Soufi Noughabi on 16 Sep 2020
Nice!
Namrata Goswami on 11 Dec 2020
Edited: Namrata Goswami on 11 Dec 2020
This worked for me only partially, since I need to replace missing values within groups. How can I use fillmissing within a group, for example with splitapply?
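One possible way to fill within groups, sketched under the assumption that the data is a column vector with a matching vector of group labels (the names `x`, `groupId` and `y` are hypothetical, not from this thread): loop over the group numbers returned by `findgroups` and call `fillmissing` on each group's rows separately, so that a value never carries over from one group into another.

```matlab
% Hypothetical example data: values and the group each value belongs to.
x       = [1; NaN; NaN; 4; NaN; 6];
groupId = [1;   1;   2; 2;   1; 2];

g = findgroups(groupId);      % group numbers 1..number of groups
y = x;
for k = 1:max(g)
    rows = (g == k);
    % Fill only within this group's rows, keeping their original order.
    y(rows) = fillmissing(x(rows), 'previous');
end
```

A leading NaN inside a group stays NaN here, because there is no earlier value in that group to copy.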


Matt Fig on 9 Oct 2012
Edited: Matt Fig on 9 Oct 2012

4 votes

Johannes, notice that your solution will fail if the first value in a column is NaN. Rather than looking for a vectorized solution that may end up being rather convoluted (and slower!), I would simply write a good FOR-loop function that can handle all cases. For example, the following solution does not use the FIND function and only uses simple loops, so it should be very fast:
function A = fill_nans(A)
% Replaces the nans in each column with
% previous non-nan values.
for ii = 1:size(A,2)
    I = A(1,ii);
    for jj = 2:size(A,1)
        if isnan(A(jj,ii))
            A(jj,ii) = I;
        else
            I = A(jj,ii);
        end
    end
end
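For comparison, a vectorized variant is possible too. The following is only a sketch and not part of the answer above; it assumes a MATLAB release that provides `cummax`, and the names `valid`, `rowIdx` and `colIdx` are made up for illustration. The idea is to record, for every element, the row number of the most recent non-NaN value at or above it, and then index into A with those row numbers.

```matlab
A = [ 4   5   6   7   8
     32 NaN NaN  21 NaN
     12 NaN  12 NaN NaN
     34 NaN NaN NaN NaN];

[m, n] = size(A);
valid = ~isnan(A);
valid(1,:) = true;                 % row 1 always counts as a source, so leading NaNs stay NaN
rowIdx = repmat((1:m)', 1, n);     % row number of every element
rowIdx(~valid) = 0;                % NaN positions contribute no row number
rowIdx = cummax(rowIdx, 1);        % row of the most recent source at or above each element
colIdx = repmat(1:n, m, 1);
B = A(sub2ind([m, n], rowIdx, colIdx));
```

For the matrix from the question this reproduces the requested B. Whether it is faster than the loop would have to be measured.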

7 Comments

owr on 9 Oct 2012
This is really nice, readable, and makes sense. I especially like the fact that you were able to implement it as an in-place function. In a couple of quick tests, a "find"-based solution doesn't seem to perform any worse, but I still think I like this better because it is really clean. I may use it myself, thanks for sharing!
Mohammad Sayeed on 14 Jan 2014
Hi, I tried to apply your code, but it showed the following error: "Error using fill_nans (line 4) Not enough input arguments." Can you please tell me how I can correct it?
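That error message usually means the function was run without an input, for example by pressing Run in the editor. A minimal sketch of calling it with the matrix from the question follows. The function body is repeated from the answer above so the example is self-contained; defining a function at the end of a script is assumed to be supported by your release, otherwise save it as fill_nans.m and call it from the command window.

```matlab
A = [ 4   5   6   7   8
     32 NaN NaN  21 NaN
     12 NaN  12 NaN NaN
     34 NaN NaN NaN NaN];

B = fill_nans(A);   % pass the matrix as the input argument

function A = fill_nans(A)
% Replaces the NaNs in each column with the previous non-NaN value.
for ii = 1:size(A,2)
    I = A(1,ii);
    for jj = 2:size(A,1)
        if isnan(A(jj,ii))
            A(jj,ii) = I;
        else
            I = A(jj,ii);
        end
    end
end
end
```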
Jakob Hannibal on 16 Nov 2014
This is a great little script! I want to replace with the next valid measurement instead of the previous one. Any good ideas?
@Jakob: Simply change the row loop so that it runs the other way around, starting from the last row:
I = A(end,ii);
for jj = size(A,1)-1:-1:1
Jakob Hannibal on 16 Nov 2014
Yes, I thought about that. But I tried to use flipud before the loop and then reverse the flip after the operation. I think it works too! Thanks for the feedback!
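The flipud idea can be sketched as follows. This is an illustration only: it uses `fillmissing` for the forward fill to keep the example short (an assumption, since any previous-value fill such as fill_nans would do), and the small matrix and the names `B1` and `B2` are made up.

```matlab
A = [NaN   1
       2 NaN
     NaN   3
       4 NaN];

% Flip upside down, fill with the previous value, flip back:
% each NaN ends up with the next valid value below it.
B1 = flipud(fillmissing(flipud(A), 'previous'));

% The 'next' method of fillmissing does the same in one call.
B2 = fillmissing(A, 'next');
```

A trailing NaN (last row of the second column) stays NaN, because there is no later value to copy.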
Timothy Jackson on 1 Apr 2016
Is there a way to fill from both the previous and the next values? For instance, changing
A = NaN NaN 2 4 8 NaN NaN to A = 2 2 2 4 8 8 8?
Faez Alkadi on 1 May 2017
Good question, Timothy Jackson. I hope someone can answer this.
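One way this could be done, sketched with `fillmissing` (an assumption, since that function is newer than most of this thread): fill forward first, then fill whatever is still missing at the start backward.

```matlab
A = [NaN NaN 2 4 8 NaN NaN];

B = fillmissing(A, 'previous', 2);   % trailing NaNs take the last value: NaN NaN 2 4 8 8 8
B = fillmissing(B, 'next', 2);       % leading NaNs take the first value
```

The dimension argument 2 is written out because A is a row vector. For a matrix filled column by column it would be 1.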


Wayne King on 9 Oct 2012
Edited: Wayne King on 9 Oct 2012

0 votes

How about:
A = [ 4 5 6 7 8
32 NaN NaN 21 NaN
12 NaN 12 NaN NaN
34 NaN NaN NaN NaN];
indices = isnan(A);
A(indices) = 0;
B = repmat([4 5 6 7 8],size(A,1),1);
A = A+B.*indices;

1 Comment

Johannes comments:
"The result there is:
A =
4 5 6 7 8
32 5 6 21 8
12 5 12 7 8
34 5 6 7 8
That is not what I need. I would need the following:
4 5 6 7 8
32 5 6 21 8
12 5 12 21 8
34 5 12 21 8
Still, thanks for your help!"


owr on 9 Oct 2012

0 votes

I do this all the time. My code uses for loops, but I don't see anything wrong with for loops. I'm sure there are more elegant solutions, but this does the trick for me and is more than fast enough:
function datai = backfillnans(data)
% Dimensions
[numRow,numCol] = size(data);
% First, datai is copy of data
datai = data;
% For each column
for c = 1:numCol
    % Find first non-NaN row
    indxFirst = find(~isnan(data(:,c)),1,'first');
    % Find all NaN rows
    indxNaN = find(isnan(data(:,c)));
    % Find NaN rows beyond first non-NaN
    indx = indxNaN(indxNaN > indxFirst);
    % For each of these, copy previous value
    for r = (indx(:))'
        datai(r,c) = datai(r-1,c);
    end
end

2 Comments

This seems to fail when a whole column of data is NaN.
A = [25 NaN 54 99 20
3 NaN 92 74 89
7 NaN NaN NaN 82
75 NaN 43 65 77
NaN NaN 15 NaN 38]
Ah, good catch Matt, thanks for that. I've been using this multiple times a day for almost 2 years and that has never come up. I guess I never have a full column of NaNs. It can be fixed, I guess, by putting an
if( ~isempty(indxFirst) )
after the line that calculates "indxFirst" (with a matching end). Part of me would actually like the whole process to fail so I can figure out why I passed a full column of NaNs in the first place. That would be symptomatic of a much bigger issue...
Anyway, thanks for taking the time to run and test the code.
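A sketch of what the guarded version might look like, applied to the test matrix from the comment above. The name `backfillnans_guarded` is hypothetical, and placing the function at the end of a script is assumed to be supported by your release.

```matlab
A = [ 25 NaN  54  99  20
       3 NaN  92  74  89
       7 NaN NaN NaN  82
      75 NaN  43  65  77
     NaN NaN  15 NaN  38];

B = backfillnans_guarded(A);

function datai = backfillnans_guarded(data)
% Copy of backfillnans with the isempty guard added (hypothetical name).
datai = data;
for c = 1:size(data,2)
    % First non-NaN row; empty if the whole column is NaN
    indxFirst = find(~isnan(data(:,c)), 1, 'first');
    if ~isempty(indxFirst)
        % NaN rows below the first non-NaN row
        indxNaN = find(isnan(data(:,c)));
        indx = indxNaN(indxNaN > indxFirst);
        % Copy the value from the row above, top to bottom
        for r = indx(:)'
            datai(r,c) = datai(r-1,c);
        end
    end
end
end
```

With the guard, the all-NaN second column is simply left unchanged instead of raising an error.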


0 votes

I found your procedure much more elegant and efficient. It was very helpful man.

Asked: 9 Oct 2012
Edited: 11 Dec 2020