least absolute deviation when we have data set

Question

NA, 22 August 2020

https://kr.mathworks.com/matlabcentral/answers/582995-least-absolute-deviation-when-we-have-data-set

Edited: Terry nichols, 25 December 2020

I have this data:

x = (1:10)';
y = 10 - 2*x + randn(10,1);
y(10) = 0;

How can I fit a line to it using least absolute value (L1) regression?


Answer 1

Bjorn Gustavsson, 22 August 2020

You can do it rather straightforwardly with fminsearch (or with similar tools on the File Exchange: fminsearchbnd, minimize, etc.):

M = [ones(size(x)),x]; % Matrix for linear LSQ-regression, we could do centering and scaling etc...
p0 = M\y;           % straight least-square fit - to get ourselves a sensible start-guess (hopefully)
errfcn = @(p,y,M) sum(abs(y-M*p)); % L1 error-function 
p1 = fminsearch(@(p) errfcn(p,y,M),p0); % L1-optimization
subplot(2,1,1)
plot(x,y,'.-')
hold on
plot(x,M*p0)
plot(x,M*p1)
subplot(2,1,2)
plot(x,y*0,'.-')
hold on
plot(x,y-M*p0,'.-')
plot(x,y-M*p1,'.-')
% For my test I got L1-error-function-value for the least-square-fit p0:
% errfcn(p0,y,M)
% ans =
%        22.058
% and for the L1-optimal parameters:
% >> errfcn(p1,y,M)
% ans =
%       20.067

This would generalize to more interesting problems too. Also have a look at Huber norms, which give an error norm intermediate between L1 and L2.
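A quick way to convince yourself that the L1 error function behaves as expected is the intercept-only case, where the minimizer of sum(abs(y - p)) is known to be the median of y. The following is only a minimal sketch with made-up data (the numbers are not from the original post):

```matlab
% Made-up data with one gross outlier (illustration only, not from the thread)
y = [1; 2; 3; 4; 100];
M = ones(size(y));                  % intercept-only model
errfcn = @(p) sum(abs(y - M*p));    % L1 error function
p0 = M\y;                           % least-squares fit, equals mean(y)
p1 = fminsearch(errfcn, p0);        % L1 fit, should approach median(y)
fprintf('LSQ: %g, L1: %g, median: %g\n', p0, p1, median(y));
```

The least-squares estimate is pulled towards the outlier, while the L1 estimate stays near the bulk of the data, which is the behaviour the straight-line fit above relies on.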

HTH

Bjorn Gustavsson, 29 August 2020 (edited 29 August 2020)

My Huber-like-norm function looks something like this:

function errs = normHuberlike(y,ymod,sigma)
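% Smooth Huber-like penalty: ~(y-ymod).^2/2 for small residuals, ~sigma*abs(y-ymod) for large ones
errs = sigma.^2.*(sqrt(1+(y-ymod).^2./sigma.^2)-1); % semicolon added so the values are not echoed on every call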
end

So that defines how the errors in errs vary with the scaled deviation. Then you'll have to use that in your error-function, something like this:

M = [ones(size(x)),x]; % Matrix for linear LSQ-regression, we could do centering and scaling etc...
p0 = M\y;           % straight least-square fit - to get ourselves a sensible start-guess (hopefully)
sigma = 1; % just an arbitrary setting for where the transition between quadratic and linear behaviour should be
errfcn = @(p,y,M,sigma) sum(normHuberlike(y,M*p,sigma)); % Huber-like error-function
p1 = fminsearch(@(p) errfcn(p,y,M,sigma),p0); % Huber-like optimization

One thing you should "always" do is to plot your norm functions to get a sense of how they weigh residuals of different magnitude:

dy = linspace(-4,4,401);
plot(dy,[abs(dy);dy.^2/2;normHuberlike(dy,0*dy,1)])
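For comparison, the textbook Huber loss is piecewise: quadratic for residuals up to a threshold and linear beyond it. The sketch below plots it next to the smooth Huber-like function from above. The names huber, pseudoHuber and delta are chosen here for illustration and do not appear in the thread; pseudoHuber simply restates normHuberlike as an anonymous function of the residual.

```matlab
% Standard Huber loss: r.^2/2 for abs(r) <= delta, delta*(abs(r) - delta/2) otherwise
huber = @(r,delta) (abs(r) <= delta).*0.5.*r.^2 + ...
                   (abs(r) >  delta).*delta.*(abs(r) - 0.5*delta);
% Smooth Huber-like loss, same formula as normHuberlike(y,ymod,sigma) with r = y - ymod
pseudoHuber = @(r,sigma) sigma.^2.*(sqrt(1 + r.^2./sigma.^2) - 1);

dy = linspace(-4,4,401);
plot(dy, [huber(dy,1); pseudoHuber(dy,1)])
legend('Huber, delta = 1', 'Huber-like, sigma = 1')
xlabel('residual'), ylabel('penalty')
```

Both curves are close to dy.^2/2 near zero and grow linearly for large residuals; the smooth version avoids the switch between branches, which can be convenient for derivative-based optimizers.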

NA, 6 September 2020 (edited 11 September 2020)

Thank you for taking the time to answer my question. I used your code and compared the result with 'robustfit'.

Why could I not get the same regression?

Bjorn Gustavsson, 7 September 2020

Because they use different algorithms. In the robustfit documentation you can look up the weighting used for its different settings of wfun and tune. Do the regressions differ by much? How do they vary if you vary the tuning parameters? When using robust fitting you should always check the residuals and their relative contributions to the total error function.
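One possible way to run that comparison is sketched below. It assumes the Statistics and Machine Learning Toolbox is available for robustfit; the fixed seed, the variable names and the choice of weight functions are only illustrative.

```matlab
rng(0);                                    % fixed seed so the comparison is repeatable
x = (1:10)';
y = 10 - 2*x + randn(10,1);
y(10) = 0;                                 % outlier, as in the question
M = [ones(size(x)), x];
pLSQ = M\y;                                % ordinary least squares
pL1  = fminsearch(@(p) sum(abs(y - M*p)), pLSQ);  % L1 fit as in the answer above
bBisq = robustfit(x, y);                   % default: bisquare weights, tune = 4.685
bHub  = robustfit(x, y, 'huber', 1.345);   % Huber weights with their default tune
disp([pLSQ, pL1, bBisq, bHub])             % rows: intercept, slope
res = y - M*[pLSQ, pL1, bBisq, bHub];      % residuals of each fit, one column per fit
plot(x, res, '.-')
legend('LSQ', 'L1', 'bisquare', 'Huber')
```

robustfit uses iteratively reweighted least squares with the chosen weight function, so its coefficients are not expected to coincide with the L1 fit; changing tune shows how sensitive each fit is to the outlier.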


Answer 2

Bruno Luong, 22 August 2020 (edited 22 August 2020)

% Test data
x = (1:10)';
y = 10 - 2*x + randn(10,1);
y(10) = 0;
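order = 1; %  polynomial order
M = x(:).^(0:order); % design matrix with columns x.^0, x.^1, ... (needs implicit expansion)
m = size(M,2); % number of polynomial coefficients
n = length(x); % number of data points
% Unknowns are [P; u; v], linked by M*P + u - v = y
Aeq = [M, speye(n,n), -speye(n,n)];
beq = y(:);
c = [zeros(1,m) ones(1,2*n)]'; % cost = sum(u) + sum(v)
% P is free, u and v are nonnegative
LB = [-inf(1,m) zeros(1,2*n)]';
% no upper bounds at all.
UB = [];
sol =  linprog(c, [], [], Aeq, beq, LB, UB);
Pest = sol(m:-1:1); % here is the polynomial, highest power first as polyval expects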
% Check
clf(figure(1));
plot(x, y, 'or', x, polyval(Pest,x), 'b');

Bruno Luong, 22 December 2020 (edited 22 December 2020)

"2) I am totally new to the ways of linear programming, so I am wondering how come you have no inequality constraints? I am guessing you are saying the solution must adhere to the objective function, precisely? "

Because I don't need them. I formulate the problem as

M*P + u - v = y

where u and v are extra variables that are constrained to be nonnegative, so that

u - v = y - M*P

At the minimum, at most one of u(i) and v(i) is nonzero for each i, so u(i) + v(i) = abs(y(i) - M(i,:)*P). Therefore

argmin sum(u + v) is argmin sum(abs(M*P - y)), the L1 norm of the fit residual.

I could formulate it with inequalities, but the two formulations are equivalent. There is no unique way to formulate an LP, as long as it does what we want.

As a side comment: every LP can be shown to be equivalent to a "canonical form" where all the inequalities are replaced by linear equalities plus nonnegativity bounds

argmin f'*x
    A*x = b
    x >= 0
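For readers who find the inequality version easier to follow, here is a minimal sketch of it: one bound variable t(i) per data point with -t <= M*P - y <= t, minimizing sum(t). The variable names and the noise-free test data are assumptions made for this illustration, and linprog requires the Optimization Toolbox.

```matlab
% Noise-free line with one outlier (illustration only)
x = (1:10)';
y = 10 - 2*x;
y(10) = 0;
M = [ones(size(x)), x];          % columns: intercept, slope
[n, m] = size(M);
f = [zeros(m,1); ones(n,1)];     % unknowns [P; t], cost = sum(t)
A = [ M, -eye(n);                %   M*P - y <= t
     -M, -eye(n)];               % -(M*P - y) <= t
b = [y; -y];
opts = optimoptions('linprog', 'Display', 'none');
sol = linprog(f, A, b, [], [], [], [], opts);
P = sol(1:m);                    % [intercept; slope]
L1err = sum(abs(y - M*P));       % equals sum(sol(m+1:end)) at the optimum
```

Since nine of the ten points lie exactly on one line, the L1 fit should recover that line and leave the whole misfit on the outlier. The equality form above and this inequality form have the same minimizer; which one to use is mostly a matter of taste.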

Terry nichols, 22 December 2020 (edited 25 December 2020)

Thanks much for all of your help!
