Relatively easy optimization problem in Excel it's hard to implement on Matlab

조회 수: 1 (최근 30일)
Barbab
Barbab 2022년 7월 6일
편집: Torsten 2022년 7월 7일
rng('default')
% creating fake data
data = randi([-1000 +1000],30,500);
yt = randi([-1000 1000],30,1);
% creating fake missing values
row = randi([1 15],1,500);
col = rand(1,500) < .5;
% imputing missing fake values
for i = 1:500
if col(i) == 1
data(1:row(i),i) = nan;
end
end
%% here starts my problem
wgts = ones(1,500); % optimal weights needs to be binary (only zero or one)
% this would be easy with matrix formulas but I have missing values at the
% beginning of the series
for j = 1:30
xt(j,:) = sum(data(j,:) .* wgts,2,'omitnan');
end
X = [xt(3:end) xt(2:end-1) xt(1:end-2)];
y = yt(3:end);
% from here I basically need to:
% maximize the Adjusted R squared of the regression fitlm(X,y)
% by changing wgts
% subject to wgts = 1 or wgts = 0
% and optionally to impose sum(wgts,'all') = some number;
% basically I need to select the data cols with the highest explanatory
% power, omitting missing data
This is easy to implement with Excel solver, but it only can handle 200 decision variables and it takes a lot of time. Thank you in advance.
  댓글 수: 2
Sam Chak
Sam Chak 2022년 7월 6일
Unsure what went wrong. Can you show your results in Excel? It is probably better to compare the performances of having the same data.
Barbab
Barbab 2022년 7월 6일
There is nothing wrong. I am not able to implement it in Matlab. I am familiar with Matlab but I don’t know where to start with this problem

댓글을 달려면 로그인하십시오.

답변 (1개)

Torsten
Torsten 2022년 7월 6일
편집: Torsten 2022년 7월 6일
If you accept a MATLAB solution for this problem:
min: norm(X*p-y)
s.c.
xt = data*w
0<=w<=1
w integer
where p is (3x1) and "norm" is either 1-norm or max-norm, you can use intlinprog.
If you insist at maximizing adjustable r-squared, I think you will have to use ga.
  댓글 수: 3
Barbab
Barbab 2022년 7월 6일
My problem is that I have two lags of the indpendent variable and the independent variable is a linear combination in what is contained in "data"
Torsten
Torsten 2022년 7월 7일
편집: Torsten 2022년 7월 7일
Then you will have to use ga:
And - contrary to the title - I find your optimization problem is quite hard.

댓글을 달려면 로그인하십시오.

카테고리

Help CenterFile Exchange에서 Get Started with MATLAB에 대해 자세히 알아보기

제품


릴리스

R2021a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by