MATLAB Answers

Fitting the data to power law using least square method

조회 수: 73(최근 30일)
Hi all,
I try to fit the attached data in the Excel spreadsheet to the following power law expression using the least square method.
I aim to obtain a, m and n.
In the attached Excel datasheet I have y, T and P.
y = a * ( (T/298) ^m ) * ( (P/1.2) ^n )
I have double checked the forum to see any similar questions, but the one I found was for different power law expression, in which I couldn't implement my own equation into it.
I wondered if you could suggest me any approach in MATLAB.
Thank you for your time.

채택된 답변

the cyclist
the cyclist 12 Feb 2021
Here is code that will do the fit in the original space, without applying the log transform first:
% Load the data
data = readtable("power_law_using_least_square.xlsx");
% Strip off problematic first data point
data(1,:) = [];
% Define convenient variables
Y = data.Y;
% Put explanatory variables into an array
X = [data.T/298 data.P/1.2];
% Define function that will be used to fit data
% (F is a vector of fitting parameters)
f = @(F,x) F(1) .* x(:,1).^F(2) .* x(:,2).^F(3);
beta0 = [1 1 1];
mdl_original = fitnlm(X,Y,f,beta0)
mdl_original =
Nonlinear regression model: y ~ F1*x1^F2*x2^F3 Estimated Coefficients: Estimate SE tStat pValue _________ _________ _______ __________ F1 0.0030731 0.0062366 0.49276 0.6273 F2 177.68 36.819 4.8258 9.053e-05 F3 -48.904 10.073 -4.8548 8.4504e-05 Number of observations: 24, Error degrees of freedom: 21 Root Mean Squared Error: 9.16 R-Squared: 0.813, Adjusted R-Squared 0.795 F-statistic vs. zero model: 157, p-value = 1.52e-14
% Plot actual versus predicted Y in original space
line([0 90],[0 90],'LineStyle','--','LineWidth',1,'Color','k')
axis equal
title('Fit in original space')
xlabel('Y from data')
ylabel('Y from fit')
  댓글 수: 1
Mammadbaghir Baghirzade
Hi cyclist, thanks for your effort, I will try this.

댓글을 달려면 로그인하십시오.

추가 답변(2개)

the cyclist
the cyclist 11 Feb 2021
편집: the cyclist 12 Feb 2021
I would solve this by first applying a natural log transform to both sides of the equation, which then gives you a linear equation.
% Original equation
% y = a * ( (T/298) ^m ) * ( (P/1.2) ^n )
% Transform by taking the natural log of both sides
% log(y) = log(a) + m * log(T/298) + n * log(P/1.2);
% Load the data
data = readtable("power_law_using_least_square.xlsx");
% Strip off problematic first data point
data(1,:) = [];
% Define convenient variables
logY = log(data.Y);
t = log(data.T/298);
p = log(data.P/1.2);
% Fit the transformed equation
mdl = fitlm([t p],logY)
mdl =
Linear regression model: y ~ 1 + x1 + x2 Estimated Coefficients: Estimate SE tStat pValue ________ ______ _______ __________ (Intercept) -23.042 7.1617 -3.2174 0.0041334 x1 493.26 129.05 3.8224 0.00099269 x2 -135.35 35.065 -3.8599 0.00090776 Number of observations: 24, Error degrees of freedom: 21 Root Mean Squared Error: 0.998 R-squared: 0.65, Adjusted R-Squared: 0.617 F-statistic vs. constant model: 19.5, p-value = 1.63e-05
% Plot actual versus predicted Y in log space
line([0 5],[0 5],'LineStyle','--','LineWidth',1,'Color','k')
axis equal
title('Fit in log space')
xlabel('log Y from data')
ylabel('log Y from fit')
% Plot actual versus predicted Y in linear space
line([0 90],[0 90],'LineStyle','--','LineWidth',1,'Color','k')
axis equal
title('Fit in original space')
xlabel('Y from data')
ylabel('Y from fit')
If you examine the resulting plots, you will see that the fit is not very good at all. I have not investigated whether this is because I made some mistake in the code, or if your model is a poor one for your data. But I figured I would post this regardless. Maybe it will help you.
Oh, also. I removed the data point for which Y == 0. Your model definitely can't fit that. I have no advice there.
  댓글 수: 9
Mammadbaghir Baghirzade
Hi John,
Thanks for your reply. I follow the steps provided in the literature, and this is actually a theory of over 30 years, so everyone conducting the experiments in this field fit the obtained data to the relation I provided.
This is the reason that I try to fit the data to the relation I gave.

댓글을 달려면 로그인하십시오.

John D'Errico
John D'Errico 12 Feb 2021
편집: John D'Errico 12 Feb 2021
Let me look more carefully at your data, as there are some thigns that REALLY bother me about the approach you wish to use, as well as your data.
TYP = [298 0 0.986923009833903
300.101235002186 43.1990254126569 1.01209349427671
301.908520257823 65.0218477665674 1.03411766816416
305.935528457621 50.083956213335 1.08445863704977
310.780047854022 48.5052713557729 1.14738484815678
317.018449674478 47.315470757813 1.23233523315124
324.648707751109 47.4190220529795 1.34245610258851
333.805405104856 49.5689966428311 1.48404007757928
344.874810493167 46.6742563932582 1.66967240034496
356.711851888339 49.793300320307 1.88676782866414
370.55927175896 50.0405809408228 2.16678946809033
385.660757806138 50.7992995854221 2.50659100806819
402.028726805153 51.0012599219203 2.9187576908191
419.372189798347 53.6314606614739 3.40958213745378
438.366194499516 53.6258762011997 4.01682007463642
458.068137769226 49.7585908621274 4.73103257070098
476.97709145931 36.2615978957431 5.5050249673172
491.282838125128 29.0057706899755 6.1531649417194
502.975369275774 17.5628384000534 6.72579346279318
510.206038314783 13.8072020943403 7.10020441887989
515.939729045497 9.79482823125674 7.40854285330424
520.03408770692 7.04103773327845 7.63507721328947
522.987452750413 4.1988379264908 7.80183167272305
524.746678534683 0.199850325276192 7.90251361049426
524.801368151412 0.199744549683599 7.90565992104962];
T = TYP(:,1);
Y = TYP(:,2);
P = TYP(:,3);
box on
grid on
xlabel T
ylabel P
zlabel Y
UGH. This is not good data. You need to recognize that there appears to be a t least two points that are wholly inconsistent with the rest. At the very least, that will blow any such regression modeling out of the water.
Look at the plot! If I have said anything most frequently on this forum, it is to plot EVERYTHING! Then if you get bored, plot something else. Keep on plotting until you find somethign valuable. And think about EVERY plot. What does it tell you?
When I look at that figure, I see two points that apprear to be entirely inconsistent. One of them is the first data point, with Y==0. Then there is the third data point in your list, where y is 65. Again, this is entirely inconsistent with the others wher P and T are nearly the same sa at some other data points, yet Y is something completely different from the rest. Do you understand that this will blow ANY regression model out of the water?
So I would just exclude points 1 and 3. Even the second data point seems a bit strange, but leave it in for now.
Keeplist = setdiff(1:25,[1 3]);
Sadly, plotting the marginals, neither plot looks like any sort of power law behavior. And worse, T and P are highly correlated themselves. So my question becomes, why in the nae of god and little green apples do you think this data supports that model? My guess is you have no clue as to what model may be appropriate, so you decided to try a powerlaw fit.
Lets think about the model you chose. It was:
y = a * ( (T/298) ^m ) * ( (P/1.2) ^n )
What happens when T is zero? When P is zero? In both cases, Y must be exactly zero. So my first guess is that you created the first data point arbitrarily, hoping to force the curve through zero. That first point suspiciously has y EXACTLY at zero. But your data does NOT support anything of the sort!!!!!!!!!!
In fact, it looks like your data seems to indicate that as P and T are both small, the model should predict Y is roughly 50, maybe just a little less.
And that suggests your model might have a CHANCE in hell of being something like:
Y = c + a * ( (T/298) ^m ) * ( (P/1.2) ^n )
Here c will be roughly 50 when we estimate it. If we do that, when either P or T approaches zero, the model will have some chance of predicting the data.
So next, what might be reasonable values for n and m? This is a serious problem, because P and T are HIGHLY correlated. And the fact is, this latter model will still not fit well at all.
Again, the marginal plots I showed are simply not anything that looks remotely like any power law. You have a relationship that is virtually CONSTANT for a long way but at a specific point, it seems to drop off almost linearly. And this is completely inconsistent with any powerlaw behavior!
It is my STRONG recommendation that you reconsider what model you want to use here. The model should be based on reality, on physical principles if possible. Lacking that, you need to consider what you want to do with this model. Why are you trying to fit it? What will you do with that model?
  댓글 수: 2
Mammadbaghir Baghirzade
Hi John,
Thanks for your detailed answer, but I felt the agression in your reply, and did not understand the reason.
I am aware of the fact that not all the data should be plotted, and therefore I removed 2500 data from my original result.
Additionally, I am not blindly selecting the power law, and know that there are other relations might be chosen to fit the data better. As I mentioned to your previous comment, I reference the power law relation given in the literature which is being still applied since 1980s.
However, the reason I asked this question here was to get a general application of the procedur in MATLAB.
I appreciate your time. I will close this questionarie if that be possible.

댓글을 달려면 로그인하십시오.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by