How does fsrftest calculate the p-value?

Isaiah on 18 Dec 2023
Edited: Ive J on 8 Jan 2024
I am trying to understand how fsrftest works in MATLAB. From the documentation, I understand that it uses an F-test of a null hypothesis against an alternative hypothesis, and that the resulting p-value is used to determine the importance of each feature. As far as I can tell, the p-value is not compared with a significance level, so the function does not actually reject or fail to reject the null hypothesis; it only uses the p-value to rank features.
My question is: how is the p-value calculated? Is the process the same as ANOVA?

Accepted Answer

Ive J on 8 Jan 2024
Edited: Ive J on 8 Jan 2024
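As the end of the documentation page shows, fsrftest ranks features by -log(p), so no significance level is involved. And yes, it is the same as ANOVA (to be precise, it is a GLM). Note that the NumBins argument is used to bin continuous features. The example below bins BMI by hand, fits one linear model per feature, and compares the model p-values with exp(-score) from fsrftest: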
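For reference, here is the textbook one-way ANOVA F statistic, on the assumption that this is the test applied to each (binned) predictor. The notation is mine, not the documentation's: k is the number of groups (bins), n_j the number of observations in group j, n the total sample size, \bar{y}_j the mean response in group j, and \bar{y} the overall mean.

$$
F = \frac{\sum_{j=1}^{k} n_j \,(\bar{y}_j - \bar{y})^2 \,/\, (k-1)}{\sum_{j=1}^{k} \sum_{i=1}^{n_j} (y_{ij} - \bar{y}_j)^2 \,/\, (n-k)},
\qquad
p = P\left(F_{k-1,\,n-k} \ge F\right),
\qquad
\text{score} = -\log(p)
$$

With two groups (k = 2) this F statistic equals the square of the slope's t statistic in a simple linear regression on a 0/1 predictor, which is why the fitlm p-values can be compared directly with the fsrftest scores.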
n = 100; % sample size
data = table;
data.BMI = randi([18, 50], n, 1);
% bin BMI into two categories
med_bmi = median(data.BMI);
idx = data.BMI > med_bmi;
data.BMI(idx) = 1;
data.BMI(~idx) = 0;
data.Sex = randi([0, 1], n, 1);
data.Target = randn(n, 1);
mdl_bmi = fitlm(data(:, ["BMI", "Target"]))
mdl_bmi =
Linear regression model:
    Target ~ 1 + BMI

Estimated Coefficients:
                   Estimate       SE        tStat      pValue
                   _________    _______    ________    _______
    (Intercept)      0.04267    0.13963     0.30559    0.76056
    BMI            -0.067441    0.19746    -0.34153    0.73343

Number of observations: 100, Error degrees of freedom: 98
Root Mean Squared Error: 0.987
R-squared: 0.00119, Adjusted R-Squared: -0.009
F-statistic vs. constant model: 0.117, p-value = 0.733
mdl_sex = fitlm(data(:, ["Sex", "Target"]))
mdl_sex =
Linear regression model:
    Target ~ 1 + Sex

Estimated Coefficients:
                   Estimate      SE        tStat      pValue
                   ________    _______    ________    _______
    (Intercept)    -0.10768    0.14984    -0.71864    0.47407
    Sex             0.20462    0.19847       1.031    0.30509

Number of observations: 100, Error degrees of freedom: 98
Root Mean Squared Error: 0.983
R-squared: 0.0107, Adjusted R-Squared: 0.000635
F-statistic vs. constant model: 1.06, p-value = 0.305
[~, sc] = fsrftest(data, "Target", "NumBins", 2);
p = exp(-sc)
p = 1×2
0.7334 0.3051
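The p-values above can also be reproduced without fitting a model. The following is a minimal sketch, assuming the test is a plain one-way ANOVA F-test on an already binned predictor. The variables x and y are hypothetical stand-ins for a binned feature and the response, and fcdf, anova1, and fsrftest require Statistics and Machine Learning Toolbox.

```matlab
% Sketch: compute a one-way ANOVA F-test p-value by hand for one binned predictor.
% Assumption: x is already binned into groups (here 0/1), as Sex is in the example.
rng(0);                              % fixed seed so the run is repeatable
n = 100;
x = randi([0, 1], n, 1);             % hypothetical binned predictor
y = randn(n, 1);                     % hypothetical response

g = findgroups(x);                   % group index 1..k for each observation
k = max(g);                          % number of groups (bins)
grpMean = splitapply(@mean, y, g);   % mean response per group
grpN = splitapply(@numel, y, g);     % number of observations per group

ssb = sum(grpN .* (grpMean - mean(y)).^2);   % between-group sum of squares
ssw = sum((y - grpMean(g)).^2);              % within-group sum of squares
F = (ssb / (k - 1)) / (ssw / (n - k));       % one-way ANOVA F statistic
pManual = fcdf(F, k - 1, n - k, 'upper');    % upper-tail p-value

% Cross-checks: anova1 on the same groups, and fsrftest as called in the example
pAnova = anova1(y, x, 'off');
[~, sc] = fsrftest(x, y, 'NumBins', 2);
pFsrf = exp(-sc);
disp([pManual, pAnova, pFsrf])
```

If the binning leaves the two groups unchanged, all three values should agree up to floating-point error. The ranking score returned by fsrftest is then -log(pManual).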

Release

R2023b
