interpolating missing data
Hi all,
I'm trying to estimate model parameters in MATLAB using data I collected in the lab, but I didn't measure all of the variables every day (so for some days I only have data for one variable). The data look like this (time; variable 1; variable 2; variable 3):
1 2330000 5.92275000000000e-06 36.2000000000000
2 52900000 2.79773000000000e-07 35.2000000000000
3 357000000 6.69468000000000e-08 26.1000000000000
4 389000000 1.19846000000000e-07 3.38000000000000
5 668000000 7.43263000000000e-08 0.350000000000000
6 1100000000.00000 4.52455000000000e-08 0.230000000000000
7 1530000000.00000 3.24575000000000e-08 0.340000000000000
8 1250000000.00000 3.96000000000000e-08 0.500000000000000
9 1490000000.00000 3.33154000000000e-08 0.360000000000000
10 1850000000.00000 NaN NaN
12 2050000000.00000 2.42585000000000e-08 0.270000000000000
14 2290000000.00000 NaN NaN
17 2120000000.00000 NaN NaN
19 5090000000.00000 9.79568000000000e-09 0.140000000000000
I've found a way to deal with this by replacing the NaN's with 0s, but I really don't want to do that in this case since it would screw up the estimation. I read something about interpolating the missing data using interp1 but I haven't been able to get that to work. Any help would be much appreciated. Thank you!
Accepted Answer
Sven
1 Dec 2011
Let's start with your data.
data = [1 2330000 5.92275000000000e-06 36.2000000000000
2 52900000 2.79773000000000e-07 35.2000000000000
3 357000000 6.69468000000000e-08 26.1000000000000
4 389000000 1.19846000000000e-07 3.38000000000000
5 668000000 7.43263000000000e-08 0.350000000000000
6 1100000000.00000 4.52455000000000e-08 0.230000000000000
7 1530000000.00000 3.24575000000000e-08 0.340000000000000
8 1250000000.00000 3.96000000000000e-08 0.500000000000000
9 1490000000.00000 3.33154000000000e-08 0.360000000000000
10 1850000000.00000 NaN NaN
12 2050000000.00000 2.42585000000000e-08 0.270000000000000
14 2290000000.00000 NaN NaN
17 2120000000.00000 NaN NaN
19 5090000000.00000 NaN 0.140000000000000]
Now here's how you can use interp1, looped over each column. I've updated it to handle NaN values on the end that can't be addressed with pure interpolation:
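fullData = data;
for c = 2:size(data,2)
    % Pass 1: linearly interpolate NaNs that lie between known values
    nanRows = isnan(data(:,c));
    fullData(nanRows,c) = interp1(data(~nanRows,1), data(~nanRows,c), data(nanRows,1));
    % Pass 2: fill NaNs still left at the ends with the nearest known value
    endRows = isnan(fullData(:,c));
    fullData(endRows,c) = interp1(data(~nanRows,1), data(~nanRows,c), data(endRows,1), 'nearest','extrap');
end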
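For readers on a newer release, the same two-step fill (linear inside, nearest value at the ends) can probably be done in one call. This is only a sketch, assuming MATLAB R2016b or later, where fillmissing is available; the matrix toy and its values are made up for illustration and are not the lab data above.

```matlab
% Toy matrix (made up): column 1 is time, columns 2-3 are measurements
toy = [1  10 NaN
       2 NaN   5
       4  40 NaN
       5  50   8
       7 NaN NaN];

filled = toy;
% Linear interpolation against the time column for interior NaNs;
% NaNs before the first or after the last known value take the
% nearest known value instead of being extrapolated linearly.
filled(:,2:end) = fillmissing(toy(:,2:end), 'linear', ...
    'SamplePoints', toy(:,1), 'EndValues', 'nearest');

disp(filled)
```

The 'SamplePoints' argument matters here because the time points are not evenly spaced; without it the rows would be treated as equally spaced.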
Comments: 2
Sven
2 Dec 2011
Yes, this is a small annoyance I have with interp1. Note the difference between _interpolation_ and _extrapolation_. For the former, you need a value above *and* below your query point. I assume that what you really want to do is:
1. Interpolate *linearly* for any _internal_ NaNs.
2. Set those NaN values on the outside to their nearest non-NaN neighbour's value.
My two most-used modes for *interp1* are 'linear' or 'nearest'. There's also an 'extrap' option to extrapolate. But since the above points one and two use different _forms_ of interpolation/extrapolation, you can't do this in one line.
What I do is run two interp1 commands... one to linearly interpolate, and one to 'nearestly' extrapolate. I've updated the answer accordingly.
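To see the difference on something small, here is a minimal sketch with made-up numbers (t, v and tq are illustrative, not the data from the question). The query points 0 and 6 fall outside the known range, so plain linear interp1 cannot fill them, while 'nearest' with 'extrap' can.

```matlab
% Toy data (made up): known values at t = 1, 2 and 5
t  = [1 2 5];
v  = [10 20 50];
tq = [0 3 6];                 % 0 and 6 lie outside [1, 5]

% Linear interpolation only: points outside the range come back as NaN
lin = interp1(t, v, tq);

% Nearest-neighbour with extrapolation: outside points take the end values
near = interp1(t, v, tq, 'nearest', 'extrap');

disp(lin)
disp(near)
```

Running the linear call first and then using the nearest/extrap call only for whatever is still NaN gives the combined behaviour described in points one and two.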