testDeviance

Deviance test for multinomial regression model

Since R2023a

    Description

    p = testDeviance(mdl) returns the p-value for a test that determines whether the fitted model in the MultinomialRegression model object mdl fits significantly better than an intercept-only model.

    [p,testStat] = testDeviance(mdl) also returns the value of the test statistic used to generate the p-value.

    Examples

    Load the fisheriris sample data set.

    load fisheriris

    The column vector species contains three iris flower species: setosa, versicolor, and virginica. The matrix meas contains four types of measurements for the flowers: the length and width of sepals and petals in centimeters.

    Fit a multinomial regression model using meas as the predictor data and species as the response data.

    mdl = fitmnr(meas,species);

    mdl is a multinomial regression model object that contains the results of fitting a nominal multinomial regression model to the data.

    Perform a chi-squared test with the null hypothesis that an intercept-only model performs as well as the model mdl.

    p = testDeviance(mdl)
    p = 
    7.0555e-64
    

    The small p-value indicates that enough evidence exists to reject the null hypothesis and conclude that mdl performs better than the intercept-only model.

    Load the carbig sample data set.

    load carbig

    The variables MPG and Origin contain data for car mileage and country of origin, respectively.

    Fit a multinomial regression model with MPG as the predictor data and Origin as the response. Estimate the dispersion parameter during the fitting.

    mdl = fitmnr(MPG,Origin,EstimateDispersion=true);

    mdl is a multinomial regression model object that contains the results of fitting a nominal multinomial regression model to the data.

    Perform an F-test with the null hypothesis that an intercept-only model fits the data as well as the model mdl. Display the p-value and the F-statistic.

    [p,tStats] = testDeviance(mdl)
    p = 
    1.2314e-45
    
    tStats = 
    39.1789
    

    The small p-value indicates that enough evidence exists to reject the null hypothesis and conclude that mdl performs better than the intercept-only model.

    Input Arguments

    mdl: Multinomial regression model object, specified as a MultinomialRegression model object created with the fitmnr function.

    Output Arguments

    p: Deviance test p-value, returned as a numeric scalar in the range [0,1].

    testStat: Deviance test statistic, returned as a numeric scalar. If mdl.Dispersion is estimated, testDeviance performs an F-test to determine whether the fitted model mdl fits better than an intercept-only model. If mdl.Dispersion is not estimated, testDeviance performs a chi-squared test instead.
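
As a rough illustration of the two cases, the sketch below fits the fisheriris data twice, once with the default settings and once with EstimateDispersion=true, and calls testDeviance on each model. The variable names (mdlChi2, mdlF, and so on) are placeholders chosen here, and combining the fisheriris data with EstimateDispersion is an assumption made for illustration, not a combination shown in the examples above.

```matlab
% Sketch only: requires Statistics and Machine Learning Toolbox (R2023a or later).
load fisheriris

% Dispersion not estimated, so testDeviance performs a chi-squared test
mdlChi2 = fitmnr(meas,species);
[pChi2,statChi2] = testDeviance(mdlChi2);

% Dispersion estimated during fitting, so testDeviance performs an F-test
mdlF = fitmnr(meas,species,EstimateDispersion=true);
[pF,statF] = testDeviance(mdlF);
```

The two statistics are on different scales (chi-squared versus F), so compare the p-values rather than the raw statistics.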

    More About

    Deviance

    Deviance is a generalization of the residual sum of squares. It measures the goodness of fit compared to a saturated model.

    The deviance of a model $M_1$ is twice the difference between the loglikelihood of the model $M_1$ and the saturated model $M_S$. A saturated model is a model with the maximum number of parameters that you can estimate.

    For example, if you have $n$ observations ($y_i$, $i = 1, 2, \ldots, n$) with potentially different values for $X_i^T\beta$, then you can define a saturated model with $n$ parameters. Let $L(b,y)$ denote the maximum value of the likelihood function for a model with the parameters $b$. Then the deviance of the model $M_1$ is

    $$-2\left(\log L(b_1,y) - \log L(b_S,y)\right),$$

    where $b_1$ and $b_S$ contain the estimated parameters for the model $M_1$ and the saturated model, respectively. The deviance has a chi-squared distribution with $n - p$ degrees of freedom, where $n$ is the number of parameters in the saturated model and $p$ is the number of parameters in the model $M_1$.

    Assume you have two different generalized linear regression models $M_1$ and $M_2$, and $M_2$ has a subset of the terms in $M_1$. You can assess the fit of the models by comparing their deviances $D_1$ and $D_2$. The difference of the deviances is

    $$D = D_2 - D_1 = -2\left(\log L(b_2,y) - \log L(b_S,y)\right) + 2\left(\log L(b_1,y) - \log L(b_S,y)\right) = -2\left(\log L(b_2,y) - \log L(b_1,y)\right).$$

    Asymptotically, the difference D has a chi-squared distribution with degrees of freedom v equal to the difference in the number of parameters estimated in M1 and M2. You can obtain the p-value for this test by using 1 - chi2cdf(D,v).
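
The p-value computation can be sketched as follows. The values of D and v are made-up numbers for illustration only and do not come from any fitted model. The second form uses the "upper" option of chi2cdf, which computes the upper-tail probability directly and is generally more accurate than subtracting from 1 when the p-value is very small.

```matlab
% Hypothetical inputs, chosen only to illustrate the calculation
D = 10;   % difference of the deviances
v = 3;    % difference in the number of estimated parameters

% p-value exactly as written in the text
pDirect = 1 - chi2cdf(D,v);

% Equivalent upper-tail form, which avoids cancellation for tiny p-values
pUpper = chi2cdf(D,v,"upper");
```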

    Typically, you examine D using a model M2 with a constant term and no predictors. Therefore, D has a chi-squared distribution with p - 1 degrees of freedom. If the dispersion is estimated, the difference divided by the estimated dispersion has an F distribution with p - 1 numerator degrees of freedom and n - p denominator degrees of freedom.

    Alternative Functionality

    coefTest performs an F-test to determine whether the coefficient estimates in mdl are zero. If you do not specify coefficients to test, coefTest tests whether the model mdl is a better fit to the data than a model with no coefficients.
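
A minimal sketch of the alternative, assuming the same fisheriris model as in the first example and assuming that coefTest accepts the model object as its only input when no coefficients are specified. The variable names pDeviance and pCoef are placeholders chosen here.

```matlab
% Sketch only: requires Statistics and Machine Learning Toolbox (R2023a or later).
load fisheriris
mdl = fitmnr(meas,species);

% Deviance test against the intercept-only model
pDeviance = testDeviance(mdl);

% F-test on the coefficient estimates, with no coefficients specified
pCoef = coefTest(mdl);
```

The two functions use different test statistics, so the p-values need not agree exactly even when both lead to the same conclusion.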

    Version History

    Introduced in R2023a