# anova

Analysis of variance for between-subject effects in a repeated measures model

## Description

anovatbl = anova(rm) returns the analysis of variance results for the repeated measures model rm.

anovatbl = anova(rm,'WithinModel',WM) returns the analysis of variance results computed using the response or responses specified by the within-subject model WM.

## Examples

This example uses the variables species and meas from the fisheriris sample data set (load fisheriris). The column vector species lists the species of each iris flower: setosa, versicolor, or virginica. The double matrix meas contains four measurements on each flower, in centimeters: sepal length, sepal width, petal length, and petal width.

Store the data in a table array.

t = table(species,meas(:,1),meas(:,2),meas(:,3),meas(:,4),...
'VariableNames',{'species','meas1','meas2','meas3','meas4'});
Meas = dataset([1 2 3 4]','VarNames',{'Measurements'});

Fit a repeated measures model where the measurements are the responses and the species is the predictor variable.

rm = fitrm(t,'meas1-meas4~species','WithinDesign',Meas);

Perform analysis of variance.

anova(rm)
ans=3×7 table
Within     Between     SumSq     DF     MeanSq       F         pValue
________    ________    ______    ___    _______    ______    ___________

Constant    constant    7201.7      1     7201.7     19650    2.0735e-158
Constant    species     309.61      2      154.8    422.39     1.1517e-61
Constant    Error       53.875    147    0.36649

There are 150 observations and 3 species. The degrees of freedom for species is 3 - 1 = 2, and for error it is 150 - 3 = 147. The small $p$-value of 1.1517e-61 indicates that the measurements differ significantly according to species.
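The species row can be cross-checked by hand from the printed sums of squares. The following is a minimal sketch, assuming the rounded values shown in the table above; the variable names are illustrative only, and fcdf is the F distribution function from Statistics and Machine Learning Toolbox.

```matlab
% Rounded values copied from the anova table above.
ssSpecies = 309.61;  dfSpecies = 3 - 1;    % 3 species
ssError   = 53.875;  dfError   = 150 - 3;  % 150 observations, 3 species

msSpecies = ssSpecies/dfSpecies;   % mean square for species
msError   = ssError/dfError;       % mean square for error
F = msSpecies/msError;             % F-statistic for the species term

% Upper-tail probability of the F distribution with (2,147) degrees of freedom.
p = fcdf(F,dfSpecies,dfError,'upper');
```

Because the inputs are rounded, F and p should agree with the table only up to rounding.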

The dataset array, panelData, contains yearly observations on eight cities for 6 years. The first variable, Growth, measures economic growth (the response variable). The second and third variables are city and year indicators, respectively. The last variable, Employ, measures employment (the predictor variable). This is simulated data.

Store the growth, city, and year variables in a table array.

t = table(panelData.Growth,panelData.City,panelData.Year,...
'VariableNames',{'Growth','City','Year'});

Convert the data to the wide format required for repeated measures analysis, with one row per city and one column per year.

t = unstack(t,'Growth','Year','NewDataVariableNames',...
{'year1','year2','year3','year4','year5','year6'});
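To see what this reshaping does, here is a small sketch on made-up data (two cities, three years). The table and its values are hypothetical and are not part of panelData.

```matlab
% Hypothetical long-format data: one row per city-year pair.
City   = [1 1 1 2 2 2]';
Year   = [1 2 3 1 2 3]';
Growth = [10 11 12 20 21 22]';
long = table(Growth,City,Year);

% Spread Growth into one column per value of Year; City identifies the rows.
wide = unstack(long,'Growth','Year','NewDataVariableNames',...
    {'year1','year2','year3'});
```

The result has one row per city, with the growth values in the columns year1 through year3, which is the layout fitrm expects for the responses.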

Add the mean employment level over the years as a predictor variable to the table t.

t(:,8) = table(grpstats(panelData.Employ,panelData.City));
t.Properties.VariableNames{'Var8'} = 'meanEmploy';

Define the within-subjects variable.

Year = [1 2 3 4 5 6]';

Fit a repeated measures model, where the growth figures over the 6 years are the responses and the mean employment is the predictor variable.

rm = fitrm(t,'year1-year6 ~ meanEmploy','WithinDesign',Year);

Perform analysis of variance.

anovatbl = anova(rm,'WithinModel',Year)
anovatbl=3×7 table
Within       Between        SumSq       DF      MeanSq         F         pValue
_________    __________    __________    __    __________    ________    _________

Contrast1    constant          588.17    1         588.17    0.038495      0.85093
Contrast1    meanEmploy    3.7064e+05    1     3.7064e+05      24.258    0.0026428
Contrast1    Error              91675    6          15279

The matrix Y contains response data for 16 individuals. The response is the blood level of a drug measured at five time points (time = 0, 2, 4, 6, and 8). Each row of Y corresponds to an individual, and each column corresponds to a time point. The first eight subjects are female, and the second eight subjects are male. This is simulated data.

Define a variable that stores gender information.

Gender = ['F' 'F' 'F' 'F' 'F' 'F' 'F' 'F' 'M' 'M' 'M' 'M' 'M' 'M' 'M' 'M']';

Store the data in a table array, with one row per subject, for repeated measures analysis.

t = table(Gender,Y(:,1),Y(:,2),Y(:,3),Y(:,4),Y(:,5),...
'VariableNames',{'Gender','t0','t2','t4','t6','t8'});

Define the within-subjects variable.

Time = [0 2 4 6 8]';

Fit a repeated measures model, where blood levels are the responses and gender is the predictor variable.

rm = fitrm(t,'t0-t8 ~ Gender','WithinDesign',Time);

Perform analysis of variance.

anovatbl = anova(rm)
anovatbl=3×7 table
Within     Between     SumSq     DF    MeanSq      F         pValue
________    ________    ______    __    ______    ______    __________

Constant    constant     54702     1     54702    1079.2    1.1897e-14
Constant    Gender      2251.7     1    2251.7    44.425    1.0693e-05
Constant    Error        709.6    14    50.685

There are 2 genders and 16 observations, so the degrees of freedom for gender is 2 - 1 = 1 and for error it is 16 - 2 = 14. The small $p$-value of 1.0693e-05 indicates that there is a significant effect of gender on the blood level of the drug.

Repeat analysis of variance using orthogonal contrasts.

anovatbl = anova(rm,'WithinModel','orthogonalcontrasts')
anovatbl=15×7 table
Within     Between       SumSq       DF      MeanSq          F           pValue
________    ________    __________    __    __________    __________    __________

Constant    constant         54702     1         54702        1079.2    1.1897e-14
Constant    Gender          2251.7     1        2251.7        44.425    1.0693e-05
Constant    Error            709.6    14        50.685
Time        constant        310.83     1        310.83        31.023    6.9065e-05
Time        Gender          13.341     1        13.341        1.3315       0.26785
Time        Error           140.27    14        10.019
Time^2      constant        565.42     1        565.42        98.901    1.0003e-07
Time^2      Gender          1.4076     1        1.4076       0.24621       0.62746
Time^2      Error           80.039    14        5.7171
Time^3      constant        2.6127     1        2.6127        1.4318       0.25134
Time^3      Gender      7.8853e-06     1    7.8853e-06    4.3214e-06       0.99837
Time^3      Error           25.546    14        1.8247
Time^4      constant        2.8404     1        2.8404       0.47924       0.50009
Time^4      Gender          2.9016     1        2.9016       0.48956       0.49559
Time^4      Error           82.977    14        5.9269

## Input Arguments

Repeated measures model rm, specified as a RepeatedMeasuresModel object, such as the one returned by fitrm.

For properties and methods of this object, see RepeatedMeasuresModel.

Within-subject model WM, specified as one of the following:

• 'separatemeans': The response is the average of the repeated measures (average across the within-subject model).

• 'orthogonalcontrasts': This is valid when the within-subject model has a single numeric factor T. Responses are the average, the slope of centered T, and, in general, all orthogonal contrasts for a polynomial up to T^(p - 1), where p is the number of rows in the within-subject model. anova multiplies Y, the response matrix you use in the repeated measures model rm, by the orthogonal contrasts, and uses the columns of the resulting product matrix as the responses.

anova computes the orthogonal contrasts for T using the Q factor of a QR factorization of the Vandermonde matrix.

• A character vector or string scalar that defines a model specification in the within-subject factors. Responses are defined by the terms in that model. anova multiplies Y, the response matrix you use in the repeated measures model rm, by the terms of the model, and uses the columns of the result as the responses.

For example, if there is a Time factor and 'Time' is the model specification, then anova uses two terms, the constant and the uncentered Time term. The default is '1', which performs the analysis on the average response.

• An r-by-nc matrix, C, specifying nc contrasts among the r repeated measures. If Y represents the matrix of repeated measures you use in the repeated measures model rm, then the output anovatbl contains a separate analysis of variance for each column of Y*C.

The anova table contains separate univariate analysis of variance results for each response.

Example: 'WithinModel','Time'

Example: 'WithinModel','orthogonalcontrasts'
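As an illustration of the contrast-matrix form of WithinModel, the sketch below fits a model to placeholder random data and passes a 5-by-2 matrix C. The data, the choice of contrasts, and the variable names are assumptions made for this example only.

```matlab
rng(0);                                      % reproducible placeholder data
Y = randn(16,5);                             % hypothetical responses, 16 subjects by 5 time points
Gender = [repmat('F',8,1); repmat('M',8,1)];
t = table(Gender,Y(:,1),Y(:,2),Y(:,3),Y(:,4),Y(:,5),...
    'VariableNames',{'Gender','t0','t2','t4','t6','t8'});
Time = [0 2 4 6 8]';
rm = fitrm(t,'t0-t8 ~ Gender','WithinDesign',Time);

% Column 1 averages the five measures; column 2 is last minus first.
C = [ones(5,1)/5, [-1 0 0 0 1]'];

% One univariate analysis of variance per column of Y*C.
anovatbl = anova(rm,'WithinModel',C);
```

With two contrasts and the between-subjects terms constant, Gender, and Error, the output should contain three rows for each contrast.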

## Output Arguments

Results of analysis of variance for between-subject effects, returned as the table anovatbl. The table includes all terms in the between-subjects model and the following columns.

| Column Name | Definition |
|---|---|
| Within | Within-subject factors |
| Between | Between-subject factors |
| SumSq | Sum of squares |
| DF | Degrees of freedom |
| MeanSq | Mean square (SumSq divided by DF) |
| F | F-statistic |
| pValue | p-value corresponding to the F-statistic |

### Vandermonde Matrix

A Vandermonde matrix is a matrix whose columns are powers of a vector a, that is, $V(i,j) = a(i)^{n-j}$, where n is the length of a.
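As a quick illustration, the built-in function vander follows this definition. The sketch below uses the time points from the drug-level example as the vector a and rebuilds the matrix directly from powers of a; the name Vdef is illustrative.

```matlab
a = [0 2 4 6 8]';            % time points from the example above
n = numel(a);

V = vander(a);               % built-in Vandermonde matrix

% Same matrix from the definition: column j holds a.^(n-j).
Vdef = a.^(n-1:-1:0);        % implicit expansion, R2016b or later

maxDiff = max(abs(V(:) - Vdef(:)));   % expected to be 0 up to round-off
```

The last column is a.^0, so it is all ones, and the first column holds the highest power, a.^(n-1).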

### QR Factorization

QR factorization of an m-by-n matrix A is the factorization of that matrix into the product A = Q*R, where R is an m-by-n upper triangular matrix and Q is an m-by-m unitary matrix.
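The following sketch shows how the Q factor of a Vandermonde matrix yields orthogonal polynomial contrasts for a numeric factor. It assumes the columns are ordered from the constant term upward; the exact ordering, sign, and scaling that anova uses internally are not stated here, so treat this as an illustration of the idea rather than a reproduction of the function.

```matlab
T = [0 2 4 6 8]';            % within-subject factor from the example above
V = fliplr(vander(T));       % columns: T.^0, T.^1, ..., T.^4
[Q,R] = qr(V);               % Q is 5-by-5 orthogonal, R is 5-by-5 upper triangular

orthErr = norm(Q'*Q - eye(5));       % near 0: columns of Q are orthonormal
recErr  = norm(Q*R - V)/norm(V);     % near 0: Q*R reproduces V

% Up to sign, Q(:,1) is the normalized constant vector and Q(:,2) is
% proportional to the centered factor T - mean(T).
centered = (T - mean(T))/norm(T - mean(T));
linErr = min(norm(Q(:,2) - centered), norm(Q(:,2) + centered));
```

Multiplying a response matrix Y by Q then gives one response per polynomial degree, which matches the Constant, Time, Time^2, Time^3, and Time^4 blocks in the orthogonal contrasts example.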

## Version History

Introduced in R2014a