manova1

One-way multivariate analysis of variance (MANOVA)

Syntax

d = manova1(X,group)

d = manova1(X,group,alpha)

[d,p] = manova1(___)

[d,p,stats] = manova1(___)

Description

d = manova1(X,group) performs a one-way multivariate analysis of variance (MANOVA) and returns an estimate d for the dimension of the space containing the group means. To perform the MANOVA, manova1 uses the factor in group and the data in X.

d = manova1(X,group,alpha) also specifies the significance level for the MANOVA.

[d,p] = manova1(___) also returns the p-value p corresponding to d, using any of the input argument combinations in the previous syntaxes.

[d,p,stats] = manova1(___) also returns a structure stats containing additional MANOVA statistics.

Examples

Calculate Dimension of Space Containing Mean Vectors

Load the carbig data set.

load carbig

Calculate the dimension of the space containing the group mean vectors and the corresponding p-values.

[d,p] = manova1([MPG Acceleration Weight Displacement],...
                Origin)

d = 
3

The output shows that enough evidence exists to reject the null hypothesis that the mean vectors are statistically the same. However, not enough evidence exists to reject the null hypothesis that the mean vectors lie in the same 3D space.

Specify Significance Level and Inspect Canonical Variables

Load the fisheriris data set.

load fisheriris;

The column vector species contains three iris flower species: setosa, versicolor, and virginica. The matrix meas contains four types of measurements for the flower: the length and width of sepals and petals in centimeters.

Perform a one-way MANOVA to test the null hypothesis that the vector of means for the four measurements is the same across the three flower species. Specify the significance level. Calculate the dimension of the space containing the vectors for the three flower species, the corresponding p-values, and additional statistics for the MANOVA.

[d,p,stats] = manova1(meas,species,0.01)

d = 
2

p = 2×1
10^-7 ×

    0.0000
    0.5786

stats = struct with fields:
           W: [4×4 double]
           B: [4×4 double]
           T: [4×4 double]
         dfW: 147
         dfB: 2
         dfT: 149
      lambda: [2×1 double]
       chisq: [2×1 double]
     chisqdf: [2×1 double]
    eigenval: [4×1 double]
    eigenvec: [4×4 double]
       canon: [150×4 double]
       mdist: [150×1 double]
      gmdist: [3×3 double]
      gnames: {3×1 cell}

The output shows that the vectors of means for the three species are contained in a two-dimensional space. Both p-values are smaller than the significance level of 0.01, so manova1 rejects the null hypotheses that the mean vectors are the same (dimension 0) and that they lie on a line (dimension 1). The stats structure contains additional statistics for the MANOVA.

Inspect the canonical response data for the MANOVA.

C = stats.canon

C = 150×4

   -8.0618    0.3004    0.1583   -0.2290
   -7.1287   -0.7867    0.7466    0.4909
   -7.4898   -0.2654   -0.0957    0.5471
   -6.8132   -0.6706   -0.6908    0.4402
   -8.1323    0.5145   -0.3944   -0.2742
   -7.7019    1.4617   -0.1331   -0.6035
   -7.2126    0.3558   -0.8867    0.6096
   -7.6053   -0.0116   -0.1736   -0.1893
   -6.5606   -1.0152   -0.5653    0.9595
   -7.3431   -0.9473   -0.0253   -0.1415
   -8.3974    0.6474    0.3436   -0.8188
   -7.2193   -0.1096   -1.0583   -0.1948
   -7.3268   -1.0730    0.1690    0.1914
   -7.5725   -0.8055   -0.5953    0.9841
   -9.8498    1.5859    1.6503   -1.0085
      ⋮

Each column of C corresponds to a canonical variable, and each row contains the transformed data point for the corresponding row of the input data meas. For more information about canonical variables, see Canonical Variables.

Create a scatter plot using the first and second canonical variables.

gscatter(C(:,1),C(:,2),species)

[Figure: scatter plot of the first canonical variable against the second, with points grouped by species (setosa, versicolor, virginica).]

The scatter plot shows two main clusters of data, with the measurements for setosa in one cluster and the measurements for versicolor and virginica in the other. This result also shows that the vectors of means for the three species are contained in a two-dimensional space.
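The separation visible in the plot can also be checked numerically. The following is a minimal sketch that averages the first two canonical variables within each species. It repeats the data loading so that it runs on its own; the variable names G, names, and groupMeans are illustrative and not part of manova1.

```matlab
load fisheriris
[~,~,stats] = manova1(meas,species,0.01);
C = stats.canon;

% Assign a group index to each row, one index per species.
[G,names] = findgroups(species);

% Average the first two canonical variables within each species.
% Each call returns a 1-by-2 row, so the result has one row per group.
groupMeans = splitapply(@(c) mean(c,1),C(:,1:2),G);

% Show the group means with the species names as row labels.
array2table(groupMeans,'RowNames',names, ...
    'VariableNames',{'Canon1','Canon2'})
```

Group means that are far apart along the first canonical variable correspond to the well-separated clusters in the scatter plot.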

Input Arguments

`X` — Data
numeric matrix

Data, specified as a numeric matrix with n rows, where n is the number of observations. The columns of X correspond to the elements of the multivariate means.

Data Types: single | double

`group` — Factor values
categorical vector | numeric vector | string vector | cell array of character vectors

Factor values, specified as a categorical, numeric, or string vector, or a cell array of character vectors. group must contain n elements, where n is the number of rows in X. Each element of group represents the factor value of the data in the corresponding row of X.

Example: [1,2,1,3,1,...,3,1]

Example: ["white","red","white",...,"black","red"]

Data Types: single | double | string | cell | categorical

`alpha` — Significance level
`0.05` (default) | scalar between 0 and 1

Significance level for the MANOVA, specified as a scalar between 0 and 1. For more information, see Algorithms.

Example: 0.01

Data Types: single | double

Output Arguments

`d` — Estimate of dimension
nonnegative scalar

Estimate of the dimension of the space containing the mean vectors, returned as a nonnegative integer scalar. d is less than or equal to both the number of columns in X and one less than the number of groups. For more information, see Algorithms.

`p` — p-values
nonnegative vector

p-values for the MANOVA, returned as a nonnegative vector. p contains one p-value for each dimension that manova1 tests when calculating d, starting with dimension 0, so the length of p equals the largest possible value of d. For more information, see Algorithms.

Data Types: single | double

`stats` — Additional MANOVA results
structure

Additional MANOVA results, returned as a structure with the following fields.

Field	Contents
`W`	Within-groups sum of squares and cross-products matrix
`B`	Between-groups sum of squares and cross-products matrix
`T`	Total sum of squares and cross-products matrix
`dfW`	Degrees of freedom for `W`
`dfB`	Degrees of freedom for `B`
`dfT`	Degrees of freedom for `T`
`lambda`	Vector of values of the Wilks' lambda test statistic for testing whether the means have dimension 0, 1, and so on.
`chisq`	Transformation of `lambda` to an approximate chi-square distribution
`chisqdf`	Degrees of freedom for `chisq`
`eigenval`	Eigenvalues of $W^{-1}B$
`eigenvec`	Eigenvectors of $W^{-1}B$, which are the coefficients for the canonical variables `C`, scaled so that the within-groups variance of each canonical variable is 1
`canon`	Canonical variables, equal to `XC*eigenvec`, where `XC` is `X` with the columns centered by subtracting their means (see Canonical Variables).
`mdist`	Vector of Mahalanobis distances from each point to the mean of its group
`gmdist`	Matrix of Mahalanobis distances between each pair of group means
`gnames`	Names of the groups

Data Types: struct
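The relationships among several of these fields can be checked directly. The following sketch uses the fisheriris data and assumes the standard decomposition T = W + B of the sums of squares and cross-products matrices; the err* variable names are illustrative only.

```matlab
load fisheriris
[~,~,stats] = manova1(meas,species);

% Total SSCP matrix compared with within-groups plus between-groups.
D = stats.T - (stats.W + stats.B);
errT = max(abs(D(:)));

% The degrees of freedom decompose in the same way.
dfOK = (stats.dfT == stats.dfW + stats.dfB);

% Rebuild the canonical variables from the column-centered data.
XC = meas - mean(meas,1);
E = stats.canon - XC*stats.eigenvec;
errCanon = max(abs(E(:)));

% Compare the stored eigenvalues with those of W^-1*B.
% Both sets are sorted so that the comparison ignores ordering.
ev = sort(real(eig(stats.W\stats.B)),'descend');
errEig = max(abs(ev - sort(stats.eigenval,'descend')));
```

All three error values should be close to zero, up to floating-point round-off.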

More About

Canonical Variables

The canonical variables canon are linear combinations of the original variables that maximize the separation between groups. canon(:,1) is the linear combination of the X columns that has the maximum separation between groups. Among all possible linear combinations, canon(:,1) has the most significant F-statistic in a one-way analysis of variance (ANOVA). canon(:,2) has the maximum separation subject to it being orthogonal to canon(:,1), and so on.
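As an illustration of these properties, the following sketch checks two things on the fisheriris data: that each canonical variable has a pooled within-groups variance of 1, and that the one-way ANOVA F-statistic of canon(:,1) is at least as large as that of any single original variable. The names withinVar, Fcanon, and Fraw are illustrative.

```matlab
load fisheriris
[~,~,stats] = manova1(meas,species);

% Pooled within-groups variance of each canonical variable:
% diag(V'*W*V)/dfW, where V holds the canonical coefficients.
V = stats.eigenvec;
withinVar = diag(V'*stats.W*V)/stats.dfW;

% One-way ANOVA F-statistic for the first canonical variable and for
% each original measurement. In the anova1 table, row 2 is the Groups
% row and column 5 is the F-statistic.
Y = [stats.canon(:,1) meas];
F = zeros(1,size(Y,2));
for k = 1:size(Y,2)
    [~,tbl] = anova1(Y(:,k),species,'off');
    F(k) = tbl{2,5};
end
Fcanon = F(1);
Fraw = F(2:end);
```

Each original measurement is itself a linear combination of the columns of the data, so Fcanon should be no smaller than the largest element of Fraw.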

Algorithms

manova1 determines d by calculating a test statistic for each possible value of d. The formula for the test statistic is

$-\left(n - 1 - \frac{l + r}{2}\right) \log (λ),$

where n is the number of observations, l is the number of factor levels, r is the number of response variables, and $λ$ is Wilks' lambda. For more information about Wilks' lambda, see Multivariate Analysis of Variance for Repeated Measures.

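The largest possible value of d is the smaller of the number of response variables and one less than the number of factor levels. manova1 tests the dimensions 0, 1, and so on in order, and d is the smallest dimension for which the p-value is greater than the significance level specified by alpha. If no p-value is greater than alpha, d is the largest possible value.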
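The calculation can be retraced from the fields of stats. The following sketch makes three assumptions that are worth verifying against your release: the statistic is the negative of the scaled log of Wilks' lambda (so that it is nonnegative), its reference distribution is chi-square with stats.chisqdf degrees of freedom, and d is the smallest tested dimension whose p-value exceeds alpha. The names chisq, pCheck, and dCheck are illustrative.

```matlab
load fisheriris
alpha = 0.01;
[d,p,stats] = manova1(meas,species,alpha);

n = stats.dfT + 1;      % number of observations
l = stats.dfB + 1;      % number of factor levels
r = size(stats.W,1);    % number of response variables

% Test statistic for each tested dimension 0, 1, and so on.
chisq = -(n - 1 - (l + r)/2).*log(stats.lambda);

% Upper-tail chi-square probability for each statistic.
pCheck = chi2cdf(chisq,stats.chisqdf,'upper');

% Smallest tested dimension whose p-value exceeds alpha. If every
% test rejects, use the largest possible dimension.
k = find(pCheck > alpha,1);
if isempty(k)
    dCheck = numel(pCheck);
else
    dCheck = k - 1;
end
```

Under these assumptions, chisq should agree with stats.chisq, pCheck with p, and dCheck with d.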

Alternative Functionality

Instead of using manova1, you can create a manova object using the manova function, and then use the barttest object function to calculate the dimension of the space containing the group means. The advantages of using the manova function include:

Support for two-way and N-way MANOVA
Table support for factor and response data
Additional properties of the manova object, including those for the fitted MANOVA model coefficients, degrees of freedom for the error, and response covariance matrix
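A minimal sketch of the object-based workflow follows. It assumes a release that includes the manova object and that manova accepts the factor values followed by the response matrix; check the manova and barttest reference pages for the exact syntaxes available to you. The names maov, dObj, and d1 are illustrative.

```matlab
load fisheriris

% One-way MANOVA with species as the factor and meas as the responses.
maov = manova(species,meas);

% Estimated dimension of the space containing the group means.
dObj = barttest(maov);

% For comparison, the manova1 estimate at the default significance level.
d1 = manova1(meas,species);
```

If both functions use the same test at the same significance level, the two estimates should agree.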

References

[1] Krzanowski, Wojtek J. Principles of Multivariate Analysis: A User's Perspective. New York: Oxford University Press, 1988.

[2] Morrison, Donald F. Multivariate Statistical Methods. 2nd ed. McGraw-Hill, 1976.

Version History

Introduced before R2006a
