Confusion about the representation of Root Mean Square, R Squared ...

Motiur on 26 May 2014
Commented: Elizabeth Drybrugh on 9 Feb 2018
How are errors represented in MATLAB? For example, I obtained the following after training on a dataset using LinearModel.fit(). I am confused about the Root Mean Squared Error: is the error 0.243% or 24.3%? I want to know the value of RMSE as a percentage, and whether it is shown here as a percentage or in some other form. Can somebody please clarify? The same goes for the value of R-squared: is it 0.106% or 10.6%? Thanks.
Number of observations: 48, Error degrees of freedom: 46
Root Mean Squared Error: 0.243
R-squared: 0.106, Adjusted R-Squared 0.0861
F-statistic vs. constant model: 5.43, p-value = 0.0242

Accepted Answer

Star Strider on 26 May 2014
Residuals and measures related to them are not a percentage. In the context of a one-dimensional situation, residuals are analogous to deviations from the mean, and measures derived from them are roughly analogous to the variance or standard deviation. (With heavy emphasis on ‘roughly’.)
The Coefficient of Determination (R-Squared) value could be thought of as a decimal fraction (though not a percentage), in a very loose sense. From the documentation:
  • Coefficient of determination (R-squared) indicates the proportionate amount of variation in the response variable y explained by the independent variables X in the linear regression model. The larger the R-squared is, the more variability is explained by the linear regression model.
So the higher the R-Squared value, the better the fit of the model to the data.
  4 Comments
Motiur on 26 May 2014
Saw that after I commented; sorry about that. One more thing: SSE and RMSE are similar, except that one has been averaged and square-rooted and the other has not. Is there an RMSE for a GLM? Thanks.
Star Strider on 26 May 2014
My pleasure!
No worries!
Yes, there is. I saw your other post and responded to your GLM RMSE question there.
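
For reference, the relationship between SSE and RMSE mentioned in the comment above can be written out. This is a sketch that assumes the RMSE reported for a LinearModel is the square root of SSE divided by the error degrees of freedom (DFE); the SSE value below is back-calculated from the rounded figures in the question and was not reported in the thread.

$$
\mathrm{MSE} = \frac{\mathrm{SSE}}{\mathrm{DFE}}, \qquad
\mathrm{RMSE} = \sqrt{\mathrm{MSE}}
\quad\Longrightarrow\quad
\mathrm{SSE} \approx \mathrm{RMSE}^2 \times \mathrm{DFE} = 0.243^2 \times 46 \approx 2.72
$$

Both quantities carry the units of the response (squared for SSE), so neither is a percentage.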


More Answers (2)

Kelly Kearney on 26 May 2014
Root mean squared error is
sqrt(mean((xobs - xpre).^2))
where xobs is the observed data and xpre holds the values predicted by the model for each corresponding observation. The value is absolute, not relative. Not quite sure what you mean by RMSE in terms of percentage... maybe percent error? Check the properties of the LinearModel object; it includes fitted values as well as several different measures of error that will help you perform this calculation.
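
To illustrate that RMSE is in the units of the response rather than a percentage, here is a minimal sketch using base MATLAB only. The data are made up (the thread does not include the original dataset). The fit uses the backslash operator instead of LinearModel.fit so that it runs without the Statistics Toolbox. The degrees-of-freedom variant is included on the assumption that this is the normalisation behind the "Error degrees of freedom" line in the model display.

```matlab
% Hypothetical data: the thread does not give the original dataset.
x = (1:10)';
y = [2.1 3.9 6.2 7.8 10.1 12.2 13.8 16.1 18.0 19.9]';

% Ordinary least-squares straight-line fit using base MATLAB only.
X    = [ones(size(x)) x];   % design matrix: intercept and slope columns
beta = X \ y;               % least-squares coefficients
xpre = X * beta;            % values predicted by the model
xobs = y;                   % observed values

% RMSE as written above (divides by the number of observations).
rmse_n = sqrt(mean((xobs - xpre).^2));

% Variant that divides by the error degrees of freedom (n - p).
dfe      = numel(y) - numel(beta);
rmse_dfe = sqrt(sum((xobs - xpre).^2) / dfe);

% Rescale the response by 100 (e.g. metres -> centimetres) and refit:
% the RMSE scales by the same factor, so it cannot be a percentage.
beta_cm = X \ (100*y);
rmse_cm = sqrt(mean((100*y - X*beta_cm).^2));

fprintf('RMSE (n): %.4g, RMSE (dfe): %.4g, RMSE after x100: %.4g\n', ...
        rmse_n, rmse_dfe, rmse_cm);
```

The third number should come out 100 times the first, which is the sense in which the value is "absolute, not relative".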
  2 Comments
Motiur on 26 May 2014
I know how RMSE is calculated; I only wanted to know whether the value is represented as a percentage or not. I asked because I 'think' that R-squared is expressed as a percentage. So does RMSE follow the same convention?
Kelly Kearney on 26 May 2014
If you check the doc page for LinearModel, it defines all of these values for you, under properties.
No, RMSE is not a percentage, so your RMSE is 0.243 whatever-the-input-units-were, not 0.243% or 24.3%.
R^2 is the coefficient of determination, i.e. a measure of how well the model fits the data.
Most of the terms are standard statistics terms, so if the docs aren't clear, a statistics textbook (or Wikipedia) should be able to clarify further.



John D'Errico on 26 May 2014
RMSE is never expressed as a percentage that I have ever seen. Why would it be? As a percentage of what? A percentage for RMSE is meaningless. My point is for a percentage to make sense, we need to have some value A as a relative fraction of B, so then 100*A/B can be interpreted as a percentage.
(If you DO think that you need RMSE to be in the form of a percentage, I think you are mistaken.)
Likewise, R^2 is also never expressed as a percentage that I know of, although in the context I mentioned above, one can view R^2 as a ratio of the sum of squares explained divided by the total sum of squares. In that context, when one multiplies by 100, it could have a % sign attached and make sense.
Regardless, NEITHER of these parameters is expressed as a percentage in the tool provided by MATLAB. Were that so, the help would say so, and do so explicitly, as that would be non-standard.
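
The ratio view of R^2 described above can be sketched as follows. This is an illustration on made-up data using base MATLAB only, not output from the original model; the variable names are chosen for this example.

```matlab
% Hypothetical data (not from the thread).
x = (1:10)';
y = [2.1 3.9 6.2 7.8 10.1 12.2 13.8 16.1 18.0 19.9]';

% Least-squares straight-line fit with an intercept.
X    = [ones(size(x)) x];
yhat = X * (X \ y);

sse = sum((y - yhat).^2);      % residual (unexplained) sum of squares
sst = sum((y - mean(y)).^2);   % total sum of squares about the mean
ssr = sst - sse;               % sum of squares explained by the model

% R^2 as explained / total: a unitless fraction between 0 and 1.
r2 = ssr / sst;

% Multiplying by 100 gives the "percent of variation explained" reading.
fprintf('R-squared = %.4f (about %.1f%% of the variation explained)\n', ...
        r2, 100*r2);
```

Read this way, the R-squared of 0.106 in the question corresponds to roughly 10.6% of the variation in the response being explained, although MATLAB itself reports the fraction.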
  1 Comment
Elizabeth Drybrugh on 9 Feb 2018
I can understand the confusion of thinking R2 should be a percentage, as some websites state this. However, if this is incorrect, thank you for mentioning it. I was wondering more about what R2 range is considered a good fit versus a bad fit. Of course, once you plot it you can see the difference visually. However, in mathematical terms, does anyone know any good links or journals that explain this?
NOVICE at stats. Cheers, Elizabeth

