Interpolation of in-between values in a list of different groups
Hi,
I have a dataset zerocd, from which I uploaded an extract. The variable zerocd.days can be understood as the number of days into the future from the date in the same row. What I want to do is interpolate the rates for days = 1:100 on each date. The interpolation should be based only on the rate vector of each zerocd.date, so that for a specific date the rates of all other dates are irrelevant. The interpolation should be done for every date in the dataset.
As there are 3 different dates in my extract, the goal is to have a table in the same structure as zerocd, but it should contain 300 rates, one for every combination of date (the three dates given) and days(1:100).
Please keep in mind that the values of zerocd.days are not the same for every zerocd.date!
Right now my code looks like this, and it is very slow. Do you have any suggestions for improvement?
% Get unique zerocd dates.
date = unique(zerocd.date);
n_date = numel(date);
% Get query vector for interpolation.
days_queried = (1:100)';
% Construct new table for interpolated values.
rate = nan(n_date, numel(days_queried));
Interpolated = table(date, rate);
Interpolated = splitvars(Interpolated, 'rate');
% Do rate-interpolation for all days on each date i
for i = 1:numel(date)
daily_zerocd = zerocd(zerocd.date == Interpolated.date(i),:);
daily_Interpolant = griddedInterpolant(daily_zerocd.days, ...
daily_zerocd.rate, 'linear', 'linear');
Interpolated{i,2:end} = daily_Interpolant(days_queried)';
end
% Stack to list format again.
Interpolated = stack(Interpolated,2:size(Interpolated,2),...
'NewDataVariableName','rate',...
'IndexVariableName','days');
Interpolated.days = repmat(days_queried,n_date,1);
EDIT: Here is a picture of the general principle.
Accepted Answer
Guillaume
31 Mar 2020
Edited: Guillaume
31 Mar 2020
I haven't tried to understand your code to see where the slow processing is (edit: it's probably the stack). There's no reason for this processing to be slow.
Here is how I'd do it. The splitapply could be replaced by an explicit loop, which may even be faster:
% Get query vector for interpolation.
days_queried = (1:100)';
%find unique date and assign unique ID to rows of the table
[rowid, date] = findgroups(zerocd.date); %or [date, ~, rowid] = unique(zerocd.date); %if using unique
%interpolate relevant rows for each unique day
%with splitapply, output a scalar cell array containing a column vector of interpolated rate for each day.
daily_zerocd = splitapply(@(inday, inrate) {interp1(inday, inrate, days_queried, 'linear', 'extrap')}, zerocd.days, zerocd.rate, rowid);
%stuff it all in a table
Interpolated = table(repelem(date, numel(days_queried)), repmat(days_queried, numel(date), 1), vertcat(daily_zerocd{:}), ...
'VariableNames', {'date', 'days', 'rate'})
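For reference, the explicit-loop variant mentioned above might look roughly like this. It is only a sketch under a few assumptions: the small zerocd table is made-up toy data (not the extract from the question), and the names obs_date, udate, rate_interp and in_group are illustrative, not taken from the original code.

```matlab
% Hypothetical toy data: two dates whose day grids differ.
obs_date = datetime(2020, 3, [30; 30; 30; 31; 31]);
obs_days = [1; 50; 100; 10; 90];
obs_rate = [0.010; 0.015; 0.020; 0.012; 0.020];
zerocd = table(obs_date, obs_days, obs_rate, ...
    'VariableNames', {'date', 'days', 'rate'});

% Query vector for interpolation.
days_queried = (1:100)';
n_query = numel(days_queried);

% One group ID per row, one ID per unique date.
[rowid, udate] = findgroups(zerocd.date);
n_date = numel(udate);

% Preallocate one column of interpolated rates per date.
rate_interp = nan(n_query, n_date);
for k = 1:n_date
    in_group = rowid == k;   % rows belonging to the k-th date
    rate_interp(:, k) = interp1(zerocd.days(in_group), zerocd.rate(in_group), ...
        days_queried, 'linear', 'extrap');
end

% rate_interp(:) is column-major: all days of the first date, then the
% second date, and so on, which matches the repelem/repmat ordering.
Interpolated = table(repelem(udate, n_query), ...
    repmat(days_queried, n_date, 1), rate_interp(:), ...
    'VariableNames', {'date', 'days', 'rate'});
```

Because the result is written into a preallocated numeric matrix and the table is built only once at the end, no per-iteration table indexing or stack call is needed. Whether this is actually faster than splitapply on the full dataset would have to be timed.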