# coefTest

Linear hypothesis test on linear regression model coefficients

## Syntax

`p = coefTest(mdl)`
`p = coefTest(mdl,H)`
`p = coefTest(mdl,H,C)`
`[p,F] = coefTest(___)`
`[p,F,r] = coefTest(___)`

## Description

`p = coefTest(mdl)` computes the p-value for an F-test that all coefficient estimates in `mdl`, except for the intercept term, are zero.

`p = coefTest(mdl,H)` performs an F-test that H × B = 0, where B represents the coefficient vector. Use `H` to specify the coefficients to include in the F-test.

`p = coefTest(mdl,H,C)` performs an F-test that H × B = C.

`[p,F] = coefTest(___)` also returns the F-test statistic `F` using any of the input argument combinations in previous syntaxes.

`[p,F,r] = coefTest(___)` also returns the numerator degrees of freedom `r` for the test.

## Examples

Fit a linear regression model and test the coefficients of the fitted model to see if they are zero.

Load the `carsmall` data set and create a table in which the `Model_Year` predictor is categorical.

```
load carsmall
Model_Year = categorical(Model_Year);
tbl = table(MPG,Weight,Model_Year);
```

Fit a linear regression model of mileage as a function of the weight, weight squared, and model year.

`mdl = fitlm(tbl,'MPG ~ Model_Year + Weight^2')`
```
mdl = 
Linear regression model:
    MPG ~ 1 + Weight + Model_Year + Weight^2

Estimated Coefficients:
                      Estimate         SE         tStat       pValue  
                     __________    __________    _______    __________

    (Intercept)          54.206        4.7117     11.505    2.6648e-19
    Weight            -0.016404     0.0031249    -5.2493    1.0283e-06
    Model_Year_76        2.0887       0.71491     2.9215     0.0044137
    Model_Year_82        8.1864       0.81531     10.041    2.6364e-16
    Weight^2         1.5573e-06    4.9454e-07      3.149     0.0022303

Number of observations: 94, Error degrees of freedom: 89
Root Mean Squared Error: 2.78
R-squared: 0.885,  Adjusted R-Squared: 0.88
F-statistic vs. constant model: 172, p-value = 5.52e-41
```

The last line of the model display shows the F-statistic value of the regression model and the corresponding p-value. The small p-value indicates that the model fits significantly better than a degenerate model consisting of only an intercept term. You can return these two values by using `coefTest`.

`[p,F] = coefTest(mdl)`
```p = 5.5208e-41 ```
```F = 171.8844 ```
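As a rough cross-check, the default call should agree with an explicit hypothesis matrix that selects every coefficient except the intercept. The sketch below refits the model from this example so that it runs on its own. The names `s`, `H`, `pH`, `FH`, and `rH` are introduced here only for illustration.

```matlab
% Refit the model from this example so the snippet is self-contained.
load carsmall
Model_Year = categorical(Model_Year);
tbl = table(MPG,Weight,Model_Year);
mdl = fitlm(tbl,'MPG ~ Model_Year + Weight^2');

% Default test: all coefficients except the intercept are zero.
[p,F,r] = coefTest(mdl);

% Explicit hypothesis matrix for the same test: a zero column for the
% intercept followed by an identity block for the other coefficients.
s = mdl.NumCoefficients;          % total number of coefficients
H = [zeros(s-1,1) eye(s-1)];
[pH,FH,rH] = coefTest(mdl,H);

% The two calls should agree up to floating-point error.
fprintf('F = %.4f, FH = %.4f, r = %d, rH = %d\n',F,FH,r,rH)
```

Writing `H` out this way is only useful as a sanity check; for the overall model test, the one-argument syntax is simpler.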

Fit a linear regression model and test the significance of a specified coefficient in the fitted model by using `coefTest`. You can also use `anova` to test the significance of each predictor in the model.

Load the `carsmall` data set and create a table in which the `Model_Year` predictor is categorical.

```
load carsmall
Model_Year = categorical(Model_Year);
tbl = table(MPG,Acceleration,Weight,Model_Year);
```

Fit a linear regression model of mileage as a function of the acceleration, weight, and model year.

`mdl = fitlm(tbl,'MPG ~ Acceleration + Model_Year + Weight')`
```
mdl = 
Linear regression model:
    MPG ~ 1 + Acceleration + Weight + Model_Year

Estimated Coefficients:
                      Estimate         SE         tStat        pValue  
                     __________    __________    ________    __________

    (Intercept)          40.523        2.5293      16.021    5.8302e-28
    Acceleration      -0.023438       0.11353    -0.20644       0.83692
    Weight           -0.0066799    0.00045796     -14.586    2.5314e-25
    Model_Year_76        1.9898       0.80696      2.4657      0.015591
    Model_Year_82        7.9661       0.89745      8.8763    6.7725e-14

Number of observations: 94, Error degrees of freedom: 89
Root Mean Squared Error: 2.93
R-squared: 0.873,  Adjusted R-Squared: 0.867
F-statistic vs. constant model: 153, p-value = 5.86e-39
```

The model display includes the p-value for the t-statistic for each coefficient to test the null hypothesis that the corresponding coefficient is zero.

You can examine the significance of a coefficient by using `coefTest`. For example, test the significance of the `Acceleration` coefficient. According to the model display, `Acceleration` is the second coefficient (the intercept is the first). Specify the coefficient by using a numeric index vector.

`[p_Acceleration,F_Acceleration,r_Acceleration] = coefTest(mdl,[0 1 0 0 0])`
```p_Acceleration = 0.8369 ```
```F_Acceleration = 0.0426 ```
```r_Acceleration = 1 ```

`p_Acceleration` is the p-value corresponding to the F-statistic value `F_Acceleration`, and `r_Acceleration` is the numerator degrees of freedom for the F-test. The returned p-value indicates that `Acceleration` is not statistically significant in the fitted model. Note that `p_Acceleration` is equal to the p-value of the t-statistic (`tStat`) for `Acceleration` in the model display, and `F_Acceleration` is the square of that `tStat` value.
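The relationship between the single-coefficient F-test and the t-test in the model display can be checked numerically. The following is a minimal sketch that refits the model from this example and reads the t-statistic and its p-value from the `Coefficients` table by row name. The names `tStat` and `pT` are illustrative.

```matlab
% Refit the model from this example so the snippet is self-contained.
load carsmall
Model_Year = categorical(Model_Year);
tbl = table(MPG,Acceleration,Weight,Model_Year);
mdl = fitlm(tbl,'MPG ~ Acceleration + Model_Year + Weight');

% F-test on the Acceleration coefficient (second coefficient).
[p,F] = coefTest(mdl,[0 1 0 0 0]);

% Look up the t-test results for the same coefficient by row name.
tStat = mdl.Coefficients{'Acceleration','tStat'};
pT    = mdl.Coefficients{'Acceleration','pValue'};

% For a single-coefficient hypothesis, F should equal tStat^2 and the
% two p-values should coincide, up to floating-point error.
fprintf('F = %.6f, tStat^2 = %.6f\n',F,tStat^2)
fprintf('p = %.6f, pValue  = %.6f\n',p,pT)
```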

Test the significance of the categorical predictor `Model_Year`. Instead of testing `Model_Year_76` and `Model_Year_82` separately, you can perform a single test for the categorical predictor `Model_Year`. Specify `Model_Year_76` and `Model_Year_82` by using a numeric index matrix.

`[p_Model_Year,F_Model_Year,r_Model_Year] = coefTest(mdl,[0 0 0 1 0; 0 0 0 0 1])`
```p_Model_Year = 2.7408e-14 ```
```F_Model_Year = 45.2691 ```
```r_Model_Year = 2 ```

The returned p-value indicates that `Model_Year` is statistically significant in the fitted model.

You can also return these values by using `anova`.

`anova(mdl)`
```
ans=4×5 table
                     SumSq     DF    MeanSq        F          pValue  
                    _______    __    _______    ________    __________

    Acceleration    0.36613     1    0.36613    0.042618       0.83692
    Weight           1827.7     1     1827.7      212.75    2.5314e-25
    Model_Year       777.81     2      388.9      45.269    2.7408e-14
    Error            764.59    89      8.591                          
```
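A hypothesis matrix is not limited to picking out individual coefficients. Because the test is on H × B, a row of `H` can also encode a contrast between coefficients. The sketch below, which is not part of the original example, tests whether the `Model_Year_76` and `Model_Year_82` coefficients are equal. It assumes the coefficient order shown in the model display above and prints the coefficient names so that the assumption can be verified.

```matlab
% Refit the model from this example so the snippet is self-contained.
load carsmall
Model_Year = categorical(Model_Year);
tbl = table(MPG,Acceleration,Weight,Model_Year);
mdl = fitlm(tbl,'MPG ~ Acceleration + Model_Year + Weight');

% Confirm the coefficient order before writing H by position.
disp(mdl.CoefficientNames)

% One-row contrast: Model_Year_76 - Model_Year_82 = 0
% (assumes these are the fourth and fifth coefficients).
H = [0 0 0 1 -1];
[p,F,r] = coefTest(mdl,H);

fprintf('p = %.4g, F = %.4f, r = %d\n',p,F,r)
```

A small p-value here would suggest that the two model-year effects differ from each other, which is a different question from whether either one differs from zero.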

## Input Arguments

`mdl`: Linear regression model object, specified as a `LinearModel` object created by using `fitlm` or `stepwiselm`, or a `CompactLinearModel` object created by using `compact`.

`H`: Hypothesis matrix, specified as an `r`-by-`s` numeric index matrix, where `r` is the number of coefficients to include in an F-test, and `s` is the total number of coefficients.

• If you specify `H`, then the output `p` is the p-value for an F-test that H × B = 0, where B represents the coefficient vector.

• If you specify `H` and `C`, then the output `p` is the p-value for an F-test that H × B = C.

Example: `[1 0 0 0 0]` tests the first coefficient among five coefficients.

Data Types: `single` | `double`
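Writing `H` by position depends on the coefficient order in the fitted model. One way to avoid counting columns by hand is to build the rows of `H` from the `CoefficientNames` property. The following is a sketch under the assumption that the model is the `carsmall` fit used in the examples. The variable names (`names`, `I`, `H_weight`, `H_year`) are illustrative.

```matlab
% Fit the model used in the examples so the snippet is self-contained.
load carsmall
Model_Year = categorical(Model_Year);
tbl = table(MPG,Acceleration,Weight,Model_Year);
mdl = fitlm(tbl,'MPG ~ Acceleration + Model_Year + Weight');

names = mdl.CoefficientNames;     % 1-by-s cell array of coefficient names
I = eye(numel(names));            % each row selects one coefficient

% One-row H for a single coefficient, selected by exact name.
H_weight = I(strcmp(names,'Weight'),:);

% Multi-row H for all dummy variables of a categorical predictor,
% selected by the name prefix that fitlm gives them.
H_year = I(startsWith(names,'Model_Year_'),:);

[pW,FW,rW] = coefTest(mdl,H_weight);
[pY,FY,rY] = coefTest(mdl,H_year);
fprintf('Weight:     p = %.4g, r = %d\n',pW,rW)
fprintf('Model_Year: p = %.4g, r = %d\n',pY,rY)
```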

`C`: Hypothesized value for testing the null hypothesis, specified as a numeric vector with the same number of rows as `H`.

If you specify `H` and `C`, then the output `p` is the p-value for an F-test that H × B = C, where B represents the coefficient vector.

Data Types: `single` | `double`
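To illustrate a nonzero hypothesized value, the sketch below tests whether the `Weight` coefficient equals a specific number rather than zero. The value `-0.005` is chosen purely for illustration and has no special meaning for this data set. The snippet assumes the `carsmall` fit used in the examples.

```matlab
% Fit the model used in the examples so the snippet is self-contained.
load carsmall
Model_Year = categorical(Model_Year);
tbl = table(MPG,Acceleration,Weight,Model_Year);
mdl = fitlm(tbl,'MPG ~ Acceleration + Model_Year + Weight');

% Coefficient order in this model:
% (Intercept), Acceleration, Weight, Model_Year_76, Model_Year_82
H = [0 0 1 0 0];    % select the Weight coefficient
C = -0.005;         % hypothesized value (illustrative assumption)

% Test the null hypothesis that the Weight coefficient equals C.
[p,F,r] = coefTest(mdl,H,C);
fprintf('p = %.4g, F = %.4f, r = %d\n',p,F,r)
```

With a one-row `H`, `C` is a scalar. With a multi-row `H`, supply one hypothesized value per row.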

## Output Arguments

`p`: p-value for the F-test, returned as a numeric value in the range [0,1].

`F`: Value of the test statistic for the F-test, returned as a numeric value.

`r`: Numerator degrees of freedom for the F-test, returned as a positive integer. The F-statistic has `r` degrees of freedom in the numerator and `mdl.DFE` degrees of freedom in the denominator.

## Algorithms

The p-value, F-statistic, and numerator degrees of freedom are valid under these assumptions:

• The data comes from a model represented by the formula in the `Formula` property of the fitted model.

• The observations are independent, conditional on the predictor values.

Under these assumptions, let β represent the (unknown) coefficient vector of the linear regression. Suppose H is a full-rank matrix of size r-by-s, where r is the number of coefficients to include in an F-test, and s is the total number of coefficients. Let c be a column vector with r rows. The following is a test statistic for the hypothesis that Hβ = c:

`$F=\frac{{\left(H\hat{\beta }-c\right)}^{\prime }{\left(HV{H}^{\prime }\right)}^{-1}\left(H\hat{\beta }-c\right)}{r}.$`

Here $\hat{\beta }$ is the estimate of the coefficient vector β, stored in the `Coefficients` property, and V is the estimated covariance of the coefficient estimates, stored in the `CoefficientCovariance` property. When the hypothesis is true, the test statistic F has an F distribution with r and u degrees of freedom, where u is the degrees of freedom for error, stored in the `DFE` property.
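The statistic can be reproduced by hand from the model properties named above. The following is a minimal sketch, not the function's actual implementation. It forms the quadratic form in Hβ̂ − c, divides it by r so that the result is on the F scale with r and u degrees of freedom, and obtains the p-value from the upper tail of the F distribution with `fcdf`. The model is the `carsmall` fit from the examples, and names such as `Fmanual` and `pManual` are illustrative.

```matlab
% Fit the model used in the examples so the snippet is self-contained.
load carsmall
Model_Year = categorical(Model_Year);
tbl = table(MPG,Acceleration,Weight,Model_Year);
mdl = fitlm(tbl,'MPG ~ Acceleration + Model_Year + Weight');

% Hypothesis: both Model_Year coefficients are zero.
H = [0 0 0 1 0; 0 0 0 0 1];
c = zeros(size(H,1),1);

b = mdl.Coefficients.Estimate;    % coefficient estimates
V = mdl.CoefficientCovariance;    % estimated covariance of the estimates
u = mdl.DFE;                      % error degrees of freedom
r = size(H,1);                    % numerator degrees of freedom (H full rank)

% Quadratic form divided by r, then the upper-tail F probability.
d = H*b - c;
Fmanual = (d' * ((H*V*H') \ d)) / r;
pManual = fcdf(Fmanual,r,u,'upper');

% Compare with coefTest.
[p,F,rOut] = coefTest(mdl,H);
fprintf('F: manual = %.4f, coefTest = %.4f\n',Fmanual,F)
fprintf('p: manual = %.4g, coefTest = %.4g\n',pManual,p)
```

Using the backslash operator instead of an explicit inverse is a numerical-stability choice made in this sketch.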