Filling missing values of selected columns of a table with previous values

조회 수: 11 (최근 30일)
Sam17
Sam17 2019년 4월 23일
댓글: Sam17 2019년 4월 24일
I am trying to fill the NaNs from a selected columns of interest with previous values.
For example lets say i have a table Data as:
SN English Science Maths Health
1 100 80 89 74
2 NaN 85 NaN 89
3 NaN NaN 75 NaN
4 80 NaN NaN NaN
5 NaN 75 NaN 60
Lets say if i only want to fill with previous values for column English and Maths. How do i do it? I tried this line. I went to error as
Data=fillmissing(Data{:,'English','Science'},'previous');
You cannot subscript a table using linear indexing (one subscript) or multidimensional indexing (three or more subscripts). Use a
row subscript and a variable subscript.

답변 (3개)

Guillaume
Guillaume 2019년 4월 24일
Star's answer works (please don't delete it!) or you can use:
Data(:, {'English', 'Science'}) = fillmissing(Data(:, {'English', 'Science'}), 'previous');
You need to understand the difference between {} indexing (extracts content of a table as a cell array or matrix) and () indexing (extracts portion of the table, resut is still a table) for tables.

Star Strider
Star Strider 2019년 4월 23일
You most likely have to use fillmissing with each variable separately:
Data.English = fillmissing(Data.English,'previous');
Data.Science = fillmissing(Data.Science,'previous');
That works.
  댓글 수: 2
Sam17
Sam17 2019년 4월 24일
So I can't fill missing only on selected columns in a table?
Star Strider
Star Strider 2019년 4월 24일
You can, but apparently only one at a time, as in my code. I experimented with varfun, using an anonymous function with fillmissing, however varfun threw an error about anonymous functions (that made no sense to me, since the anonymous function worked outside of varfun).

댓글을 달려면 로그인하십시오.


Sam17
Sam17 2019년 4월 24일
So, I wrote few lines of Code. I wrote the variables I want to interpolate in excel sheet and imported the excel in matlab and wrote these lines of code. By code works but it takes so much of time may be because my file size is big. Is there any way I can rewrite this code and make it more simpler and less time and memory consuming ?
[FileName,PathName] = uigetfile('*.xls*','Select excel file');
[~,~,interpolate] = xlsread(strcat(PathName,FileName));
interpolate = interpolate(2:end,2)';
variables=Total_Data.Properties.VariableNames;
to_delete= interpolate(2:end);
index=find(ismember(variables,to_delete));
Total_Data(:,index)=[];
index=find(ismember(variables,interpolate));
Data=Total_Data(:,index);
Data = fillmissing(Data,'previous');
Total_Data = outerjoin(Total_Data,Data,'MergeKeys',true);
  댓글 수: 2
Guillaume
Guillaume 2019년 4월 24일
This is a different question from your original one, so please start a new question. It certainly shouldn't be an answer to the current question (for which you should accept an answer if one actually solved your original problem).
When you ask your new question, you will most likely be asked: Have you used the profiler to find which part of the code is slow?
Note that none of the find are needed in your code (but the impact is most likely very minimal)
index = ismember(variables,interpolate);
Data=Total_Data(:,index);
works just as well.

댓글을 달려면 로그인하십시오.

카테고리

Help CenterFile Exchange에서 Logical에 대해 자세히 알아보기

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by