Different results with Polyfit?

Jules Ray on 3 Mar 2017
Edited: John D'Errico on 3 Mar 2017
I'm using polyfit to fit a simple linear regression of degree 1.
However, the coefficients of the fitted line vary slightly. The variation is not large, but each time I run polyfit the coefficients are slightly different from those of the previous run.
Indeed, if I run polyfit in a loop or with a bootstrap, I obtain several similar but slightly different values.
I know the results are fine and the variation is probably related to the uncertainty of the polyfit estimate, but I don't understand the mathematics behind it.
Does anyone have an idea why this happens?
Cheers

Accepted Answer

John D'Errico on 3 Mar 2017
It does not happen. IF you pass exactly the same data (in the same order) to polyfit, calling polyfit the same way each time, it will return exactly the same result, time after time after time.
If you change the data in any way, then of course you will get different results. In fact, even changing the order of the data points is sufficient.
x = randn(100,1);
y = randn(100,1);
P1 = polyfit(x,y,1);          % fit on the original ordering
s = randperm(100);            % random permutation of the indices
P2 = polyfit(x(s),y(s),1);    % same points, different order
P1 == P2
ans =
1×2 logical array
0 0
P1 - P2
ans =
1.7347e-17 6.9389e-17
So all I did was permute the data. Change the order of ANY set of floating point additions, subtractions, multiplications, etc., and you can get a slightly different result.
0.3 - 0.2 - 0.1
ans =
-2.7756e-17
-0.1 - 0.2 + 0.3
ans =
-5.5511e-17
What I have described is NOT due to the accuracy of polyfit, but simply a basic feature of floating point arithmetic. If you are having a different problem, I cannot guess what it is, since you have told us virtually nothing about what you really did. People screw up all sorts of things in all sorts of different ways. So without knowing true specifics, all we can do is make guesses.
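As a minimal sketch of the two claims above (identical input gives identical output, and reordered input differs only at roundoff level), something like the following could be used. The seed and the tolerance `tol` are assumptions chosen for illustration, not values from the thread.

```matlab
rng(0);                          % assumed seed, only so the sketch is repeatable
x = randn(100,1);
y = randn(100,1);

% Same data, same order: the two results are bit-for-bit identical.
Pa = polyfit(x,y,1);
Pb = polyfit(x,y,1);
sameBits = isequal(Pa,Pb);

% Same data, permuted order: compare with a tolerance instead of ==.
s   = randperm(100);
Pc  = polyfit(x(s),y(s),1);
tol = 1e-12;                     % assumed tolerance, well above roundoff for this data
closeEnough = all(abs(Pa - Pc) <= tol);
```

Comparing against a tolerance rather than with == is the usual way to test floating point results that were computed in a different order.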
Comments: 2
Jules Ray on 3 Mar 2017
Edited: Jules Ray on 3 Mar 2017
Thanks for your answer... maybe I'm misunderstanding something. Here is an example:
if true
clear
X_E=[257180.148132324,257182.200988770,257183.951254355,257185.459594727,257188.927673340,257191.060831488,257193.484331052,257195.999755230,257198.445690139,257200.644455053,257202.582223719,257204.208745053,257205.498291016,257206.860518441,257207.425048828,257208.542259765,257211.649983677,257212.819885254,257213.430984312,257213.955834038,257215.425800405,257216.288452148,257217.560921910,257219.270987033,257221.237270770,257227.119033113,257228.795787703,257230.161193848,257231.636797666,257234.892593131,257236.712341309,257240.565673828,257243.738312829,257246.554913088,257248.971808274,257250.970520020,257253.111625686,257254.053405762,257255.256326070,257258.240346596,257259.448242188,257260.668271778,257263.137341217,257264.457885742,257265.430337653,257267.760200739,257268.697021484,257271.394470215,257273.033179828,257276.404113770,257280.206731603,257287.889458686,257291.017592100,257294.171966406,257296.828247070,257298.714889681,257303.110915463,257305.305908203,257308.438319313,257309.544860840,257310.066639695,257310.900325875,257311.471618652,257312.162379056,257314.169325606,257314.939941406,257315.341124861,257315.646524022,257316.445689880,257316.793421783];
Y_E=[5596222.34149170,5596231.86547852,5596232.07256538,5596232.58905029,5596235.28668213,5596236.33734967,5596237.16789960,5596237.71112480,5596237.93261018,5596237.82725106,5596237.40775195,5596236.68531664,5596235.67193604,5596233.58199607,5596232.97448731,5596232.55405675,5596232.52503967,5596232.20361328,5596231.68610531,5596230.90280274,5596227.01130997,5596225.65270996,5596224.73418842,5596224.14679759,5596223.86592936,5596223.48732638,5596223.12727301,5596222.56982422,5596221.53931947,5596218.59161443,5596217.56018066,5596216.78955078,5596215.75764809,5596214.50269830,5596213.04436437,5596211.39428711,5596208.42910900,5596207.54095459,5596207.07512067,5596206.73276630,5596206.38488770,5596205.51530027,5596202.89199862,5596202.14593506,5596202.05706356,5596202.29071064,5596202.14593506,5596200.21893311,5596199.85311049,5596199.83367920,5596199.93702897,5596200.97563792,5596201.20824736,5596200.99804124,5596200.21893311,5596199.10731173,5596195.42072160,5596194.05340576,5596193.13523611,5596192.51190186,5596191.76819885,5596189.69076160,5596189.04382324,5596188.83174319,5596188.84636981,5596188.65838623,5596188.29374978,5596187.68107973,5596184.40266686,5596183.89414149];
iterations = 1000;
% bootstrp resamples (X_E, Y_E) with replacement and calls polyfit(x,y,1) on each resample
[p_bootstrp_E,bb] = bootstrp(iterations,'polyfit',X_E,Y_E,1);
end
John D'Errico on 3 Mar 2017
Edited: John D'Errico on 3 Mar 2017
I think you don't understand what a bootstrap does.
https://en.wikipedia.org/wiki/Bootstrapping_(statistics)
In there, the very first line says "In statistics, bootstrapping is any test or metric that relies on random sampling with replacement."
The help for bootstrp says the same thing.
"bootstrp creates each bootstrap sample by sampling with replacement from the rows of the non-scalar data arguments"
The variability is NOT due to polyfit. This is not a question of what happens each time you call polyfit. In fact, you are not calling polyfit directly at all. The issue is which resample of your data bootstrp passes to polyfit on each iteration. When you do random sampling, you will get different results. Can you expect a fixed result given random sampling?
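To make the resampling step concrete, here is a rough hand-written sketch of what a bootstrap of polyfit amounts to, using only randi and polyfit. It is an illustration under assumptions, not the internals of bootstrp: the data `x`, `y`, the seed, and the variable names are hypothetical and are not the X_E/Y_E values posted above.

```matlab
rng(1);                                 % assumed seed, only so the sketch is repeatable
x = (1:50)';                            % hypothetical data, not the posted X_E/Y_E
y = 2*x + 3 + randn(50,1);

n     = numel(x);
nboot = 1000;
P     = zeros(nboot,2);                 % one [slope intercept] row per resample
for k = 1:nboot
    idx    = randi(n,n,1);              % n indices drawn WITH replacement
    P(k,:) = polyfit(x(idx),y(idx),1);  % fit on this resample only
end

Pfull  = polyfit(x,y,1);                % fit on the full data: the same every time
spread = std(P);                        % bootstrap spread of [slope intercept]
```

Each row of `P` differs because each `idx` selects a different resample, while `Pfull` never changes. Resetting the generator with rng to the same seed before the loop reproduces the same resamples, and therefore the same `P`.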
