How does vectorized lsqcurvefit() calculate sum-of-square, row-wise or column-wise?

Question

Xingwang Yong 2021년 1월 14일

0
링크

이 질문에 대한 바로 가기 링크

https://kr.mathworks.com/matlabcentral/answers/716698-how-does-vectorized-lsqcurvefit-calculate-sum-of-square-row-wise-or-column-wise

편집: Xingwang Yong 2021년 5월 25일

채택된 답변: Matt J

MATLAB Online에서 열기

I have a 1001 measuments, in each measurement, there is a curve

y = a*exp(-b*x)

where x and y are vectors containing 8 elements, a and b are scalar parameters I want to fit. The 1001 measurements are independent and I want to get 1001 a and b.

Generally, I would do

% load measured data
% ...
% size(all_xdata) = 1001*8;
% size(all_ydata) = 1001*8;
fun1D = @(x, xdata)x(1)*exp(-x(2)*xdata) % y = a*exp(-b*x)
parfor k = 1:1001
    lsqcurvefit(fun1D,x0,all_xdata(k, :), all_ydata(k, :))
end

I learned from here vectorized lsqcurvefit() may give me some speed-up.

However, I am a little confused about how to set the shape of xdata and ydata with vectorized lsqcurvefit(). The key problem here is that I do not know how lsqcurvefit() calculate the sum-of-square. From the documentation,

fun_ND = @... % vectorized version
sum-of-squares = sum((fun_ND(x,xdata)-ydata).^2)

If the size of ydata of 1001*8, what the size of sum-of-squares would be? 1001*1 (sum along each row)or 1*8(sum along each column). Apparently, for curve-fitting, we want it to be 1001*1, i.e. the sum-of-squares should be calculated within each measurement.

For a matrix, sum() would sum along column if there is not a second input provided. I am not sure if sum((fun(x,xdata)-ydata).^2) in the documentaion is pseudo-code or not.

댓글 수: 1
이전 댓글 -1개 표시이전 댓글 -1개 숨기기

Xingwang Yong 2021년 1월 14일

MATLAB Online에서 열기

I find out myself, one can assume that sum() in lsqcurvefit() would add along row, e.g.

sum-of-squares = sum((fun_ND(x,xdata)-ydata).^2)

if ydata is m-by-n, then sum-of-squares should be m-by-1.

Here is the code, althought it can not prove that sum() add along row directly.

clear
numMeasure = 101;
numPoints = 8;
a = rand(numMeasure,1);
b = rand(numMeasure, 1);
all_xdata = ones(numMeasure, 1) * (1:numPoints);
all_ydata = a*ones(1, numPoints).*exp(-b*ones(1, numPoints).*all_xdata);
fun_ND = @(x, xdata)x(:,1)*ones(1, size(xdata, 2)).*exp(-x(:,2)*ones(1, size(xdata, 2)).*xdata);
x0 = [rand(numMeasure,1),rand(numMeasure, 1)];
abfit = lsqcurvefit(fun_ND,x0,all_xdata,all_ydata);
numDisplay = 5;
yfit = fun_ND(abfit(1:numDisplay,:), all_xdata(1:numDisplay,:));
figure;
plot(all_xdata(1:numDisplay,:)', all_ydata(1:numDisplay,:)', 'ok', ...
     all_xdata(1:numDisplay,:)', yfit', '-');

Maybe you treat sum() add along column and write another version of fun_ND(), it would also work.

댓글을 달려면 로그인하십시오.

이 질문에 답변하려면 로그인하십시오.

Answer 1

Matt J 2021년 1월 14일

1
링크

이 답변에 대한 바로 가기 링크

https://kr.mathworks.com/matlabcentral/answers/716698-how-does-vectorized-lsqcurvefit-calculate-sum-of-square-row-wise-or-column-wise#answer_597748

편집: Matt J 2021년 1월 14일

MATLAB Online에서 열기

if ydata is m-by-n, then sum-of-squares should be m-by-1.

No.

The objective function minimized by lsqcurvefit is always just,

sum-of-squares = sum((fun_ND(x,xdata)-ydata).^2 ,'all')

In other words, the shape of your data matrices is irrelevant to the calculation of the objective function. However, in your case, because each row of your matrices happens to have its own independent set of parameters (a,b), the minimization is separable, i.e., the optimum (a,b) only depends on the sum-of-square terms from its corresponding row. But this has nothing to do with lsqcurvefit. It is just a feature of the parametric structure of your specific problem. If you had organized your data and parametric model column-wise instead of row-wise, however, you would get the exact same result.

Another important thing to understand with lsqcurvefit is that it does not work with your xdata, ydata, or fun_ND output in 1001x8 matrix form. It always does a preliminary reshape of all those matrices into 8008x1 column vectors (see also here). Similarly, when you supply your own sparse Jacobian calculation (which you must do to get the advantages of this technique) lsqcurvefit will expect to receive an 8008x2002 matrix.

As one consequence of this, I think it will be better for you to organize your xdata and ydata as 8x1001 matrices instead of 1001x8 (and change fun_ND accordingly). That way, your Jacobians will have a simple, sparse, block diagonal form, where each of the diagonal blocks is 8x2.

댓글 수: 8
이전 댓글 6개 표시이전 댓글 6개 숨기기

Xingwang Yong 2021년 1월 14일

MATLAB Online에서 열기

Thanks, Matt. If you are right, saying sum-of-squares is just a scalar in the vectorized version, then I'll doubt that vectorized ND version would not give the same result as the 1D version.

"the minimization is separable, i.e., the optimum (a,b) only depends on the sum-of-square terms from its corresponding row". I can not agree with this. If sum-of-squares is just a scalar, then (a1,b1) will be influenced from data from 2nd measurement, 3rd measurement... As a result, ND version will give different results than 1D version, i.e. N 1D fitting does not equal to 1 ND fitting.

Here is the code demonstrating this,

clear
numMeasure = 101; % for speed, change 1001 to 101
numPoints = 8;
a = rand(numMeasure,1); 
b = rand(numMeasure, 1);
all_xdata = ones(numMeasure, 1) * (1:numPoints);
all_ydata = a*ones(1, numPoints).*exp(-b*ones(1, numPoints).*all_xdata);
x0ND = [rand(numMeasure,1),rand(numMeasure, 1)];
fun_1D = @(x, xdata)x(1)*exp(-x(2)*xdata);
fun_ND = @(x, xdata)x(:,1)*ones(1, size(xdata, 2)).*exp(-x(:,2)*ones(1, size(xdata, 2)).*xdata);
opt = optimoptions('lsqcurvefit', 'Display', 'off');
abfit_1D = zeros(numMeasure, 2);
for k = 1:numMeasure
    abfit_1D(k, :) = lsqcurvefit(fun_1D, x0ND(k, :), all_xdata(k,:), all_ydata(k,:),[],[],opt);
end
abfit_ND = lsqcurvefit(fun_ND,x0ND,all_xdata,all_ydata,[],[],opt);
isequal(abfit_1D, abfit_ND)

Although the ND version is definitely seperable analytically, the code above indicates that it is not seperable numerically.

Matt J 2021년 1월 14일

편집: Matt J 2021년 1월 14일

MATLAB Online에서 열기

Although the ND version is definitely seperable analytically, the code above indicates that it is not seperable numerically.

Using isequal is not a meaningful test. You should be looking at percent differences. Below, I test all three different methods that we've discussed and find they differ by negligible floating point discrepancies:

numMeasure = 101; % for speed, change 1001 to 101
numPoints = 8;
a = rand(numMeasure,1); 
b = rand(numMeasure, 1);
all_xdata = ones(numMeasure, 1) * (1:numPoints);
all_ydata = a*ones(1, numPoints).*exp(-b*ones(1, numPoints).*all_xdata);
x0ND = [rand(numMeasure,1),rand(numMeasure, 1)];
fun_1D = @(x, xdata)x(1)*exp(-x(2)*xdata);
fun_rowwise = @(x, xdata)x(:,1)*ones(1, size(xdata, 2)).*exp(-x(:,2)*ones(1, size(xdata, 2)).*xdata);
fun_colwise = @(x,xdata) fun_rowwise(x.',xdata.').';
opt = optimoptions('lsqcurvefit', 'Display', 'off','OptimalityTolerance',1e-12,...
                   'StepTolerance',1e-12,'FunctionTolerance',1e-12);
abfit_1D = zeros(numMeasure, 2);
for k = 1:numMeasure
    abfit_1D(k, :) = lsqcurvefit(fun_1D, x0ND(k, :), all_xdata(k,:), all_ydata(k,:),[],[],opt);
end
abfit_rowwise = lsqcurvefit(fun_rowwise,x0ND,all_xdata,all_ydata,[],[],opt);
abfit_colwise = lsqcurvefit(fun_colwise,x0ND.',all_xdata.',all_ydata.',[],[],opt).';
%%Error analysis
percentError=max(abs(abfit_rowwise - abfit_1D),[],'all')./max(abs(abfit_1D),[],'all')*100
percentError = 7.1245e-08
percentError=max(abs(abfit_colwise - abfit_1D),[],'all')./max(abs(abfit_1D),[],'all')*100
percentError = 7.1245e-08
percentError=max(abs(abfit_colwise - abfit_rowwise),[],'all')./max(abs(abfit_rowwise),[],'all')*100
percentError = 2.2300e-14

Xingwang Yong 2021년 5월 24일

MATLAB Online에서 열기

Hi, Matt, I tried the JacobPattern option, which does not give me some speedup. More surprisingly, the vectorized version is much different from that of 1D version. So I rechecked the code you provided above, it turns out the vectorized version is not always same as that of 1D version. I tried 100 times, and 33 out of them did not match.

clear
N = 100;
counter=0;
for iter = 1:N
numMeasure = 101; % for speed, change 1001 to 101
numPoints = 8;
a = rand(numMeasure,1); 
b = rand(numMeasure, 1);
all_xdata = ones(numMeasure, 1) * (1:numPoints);
all_ydata = a*ones(1, numPoints).*exp(-b*ones(1, numPoints).*all_xdata);
x0ND = [rand(numMeasure,1),rand(numMeasure, 1)];
fun_1D = @(x, xdata)x(1)*exp(-x(2)*xdata);
fun_rowwise = @(x, xdata)x(:,1)*ones(1, size(xdata, 2)).*exp(-x(:,2)*ones(1, size(xdata, 2)).*xdata);
fun_colwise = @(x,xdata) fun_rowwise(x.',xdata.').';
opt = optimoptions('lsqcurvefit', 'Display', 'off','OptimalityTolerance',1e-12,...
                   'StepTolerance',1e-12,'FunctionTolerance',1e-12);
abfit_1D = zeros(numMeasure, 2);
for k = 1:numMeasure
    abfit_1D(k, :) = lsqcurvefit(fun_1D, x0ND(k, :), all_xdata(k,:), all_ydata(k,:),[],[],opt);
end
abfit_rowwise = lsqcurvefit(fun_rowwise,x0ND,all_xdata,all_ydata,[],[],opt);
abfit_colwise = lsqcurvefit(fun_colwise,x0ND.',all_xdata.',all_ydata.',[],[],opt).';
%%Error analysis
percentError1=max(abs(abfit_rowwise - abfit_1D),[],'all')./max(abs(abfit_1D),[],'all')*100;
percentError2=max(abs(abfit_colwise - abfit_1D),[],'all')./max(abs(abfit_1D),[],'all')*100;
percentError3=max(abs(abfit_colwise - abfit_rowwise),[],'all')./max(abs(abfit_rowwise),[],'all')*100;
if any([percentError1,percentError2]>100)
    counter = counter + 1;
end
end

Matt J 2021년 5월 24일

I tried 100 times, and 33 out of them did not match.

You have to check the exitflag and make sure that all 3 versions converged. Cases where convergence did not occur should not be counted.

Xingwang Yong 2021년 5월 25일

편집: Xingwang Yong 2021년 5월 25일

MATLAB Online에서 열기

Matt, you are right. I add bound constraints as John D'Errico suggested in his optimization tips (Chapter 17), then all 3 versions converged and match with each other.

clear
N = 100;
counter=0;
for iter = 1:N
    %% 
    disp([num2str(iter), ':']);
    numMeasure = 101; % number of curves, for speed, change 1001 to 101
    numPoints = 8; % number of data points on each curve
    numParams = 2; % number of unknown parameters
    
    a = rand(numMeasure,1);
    b = rand(numMeasure, 1);
    all_xdata = ones(numMeasure, 1) * (1:numPoints);
    all_ydata = a*ones(1, numPoints).*exp(-b*ones(1, numPoints).*all_xdata);
    
    x0ND = [rand(numMeasure,1),rand(numMeasure, 1)];
    lb = [0,0];
    ub = [1,1];
    LB = repmat(lb,numMeasure,1);
    UB = repmat(ub,numMeasure,1);
    fun_1D = @(x, xdata)x(1)*exp(-x(2)*xdata);
    fun_rowwise = @(x, xdata)x(:,1)*ones(1, size(xdata, 2)).*exp(-x(:,2)*ones(1, size(xdata, 2)).*xdata);
    fun_colwise = @(x,xdata) fun_rowwise(x.',xdata.').';
    
    opt = optimoptions('lsqcurvefit', 'Display', 'off','OptimalityTolerance',1e-12,...
        'StepTolerance',1e-12,'FunctionTolerance',1e-12);
    abfit_1D = zeros(numMeasure, 2);
    exit_flags_1D = zeros(numMeasure, 1);
    output_1D = cell(numMeasure, 1);
    
    opt1 = opt; % jacob pattern for row-wise
    opt1.JacobPattern = repmat(speye(numMeasure), numPoints, numParams);
    
    opt2 = opt; % jacob pattern for col-wise
    tmpCell = cell(numMeasure,numMeasure);
    tmpCell(:) = {sparse(zeros(numPoints,numParams))};
    for tmpIter = 1:numMeasure
        tmpCell{tmpIter,tmpIter} = sparse(ones(numPoints, numParams));
    end
    jp = cell2mat(tmpCell);
    opt2.JacobPattern = jp;
    
    %%
    % 1D
    tic
    for k = 1:numMeasure
        [abfit_1D(k, :),~,~,exit_flags_1D(k),output_1D{k}] = lsqcurvefit(fun_1D, x0ND(k, :), all_xdata(k,:), all_ydata(k,:),lb,ub,opt);
    end
    toc
    
    % row wise
    tic
    [abfit_rowwise,~,~,exit_flag1,output1] = lsqcurvefit(fun_rowwise,x0ND,all_xdata,all_ydata,LB,UB,opt1);
    toc
    % col wise
    tic
    abfit_colwise = lsqcurvefit(fun_colwise,x0ND.',all_xdata.',all_ydata.',LB.',UB.',opt2).';
    toc
    [~,~,~,exit_flag2,output2] = lsqcurvefit(fun_colwise,x0ND.',all_xdata.',all_ydata.',LB.',UB.',opt2);
    
    % Error analysis
    percentError1=max(abs(abfit_rowwise - abfit_1D),[],'all')./max(abs(abfit_1D),[],'all')*100;
    percentError2=max(abs(abfit_colwise - abfit_1D),[],'all')./max(abs(abfit_1D),[],'all')*100;
    percentError3=max(abs(abfit_colwise - abfit_rowwise),[],'all')./max(abs(abfit_rowwise),[],'all')*100;
    
    if any([percentError1,percentError2]>100)
        counter = counter + 1;
    end
end
counter

댓글을 달려면 로그인하십시오.

How does vectorized lsqcurvefit() calculate sum-of-square, row-wise or column-wise?

댓글 수: 1
이전 댓글 -1개 표시이전 댓글 -1개 숨기기

채택된 답변

댓글 수: 8
이전 댓글 6개 표시이전 댓글 6개 숨기기

추가 답변 (0개)

참고 항목

카테고리

태그

제품

Community Treasure Hunt

How does vectorized lsqcurvefit() calculate sum-of-square, row-wise or column-wise?

댓글 수: 1 이전 댓글 -1개 표시이전 댓글 -1개 숨기기

채택된 답변

댓글 수: 8 이전 댓글 6개 표시이전 댓글 6개 숨기기

추가 답변 (0개)

참고 항목

카테고리

태그

제품

Community Treasure Hunt

댓글 수: 1
이전 댓글 -1개 표시이전 댓글 -1개 숨기기

댓글 수: 8
이전 댓글 6개 표시이전 댓글 6개 숨기기