# adftest

Augmented Dickey-Fuller test

## Description

h = adftest(Y) returns a logical value with the rejection decision from conducting an augmented Dickey-Fuller test for a unit root in a univariate time series, Y.

h = adftest(Y,Name,Value) uses additional options specified by one or more Name,Value pair arguments.

• If any Name,Value argument is a vector, then all Name,Value arguments specified must be vectors of equal length or length one. adftest(Y,Name,Value) treats each element of a vector input as a separate test, and returns a vector of rejection decisions.

• If any Name,Value argument is a row vector, then adftest(Y,Name,Value) returns a row vector.
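As a rough sketch of the vector-input rules above, the following runs three tests in one call. The simulated random walk and the rng seed are assumptions made for illustration; they are not part of the documented examples.

```matlab
% Hypothetical data: a simulated random walk, which has a unit root by construction
rng(0);
Y = cumsum(randn(200,1));

% A row vector of lags requests three separate tests
hRow = adftest(Y,'lags',0:2);

% A length-one argument is shared across all tests: the same three lag
% choices, each conducted at the 0.1 significance level
hMixed = adftest(Y,'lags',0:2,'alpha',0.1);

% Per the rule above, a row-vector argument should give a row-vector result
size(hRow)
size(hMixed)
```

Each element of the result corresponds to the lag choice in the same position of the 'lags' vector.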

[h,pValue] = adftest(___) returns the rejection decision and p-value for the hypothesis test, using any of the input arguments in the previous syntaxes.

[h,pValue,stat,cValue,reg] = adftest(___) additionally returns the test statistic, critical value, and a structure of regression statistics for the hypothesis test.

## Examples

### Conduct a Dickey-Fuller Test Without Augmentation

Test a time series for a unit root using the default autoregression model without augmented difference terms.

Y = DataTable.INF_C;

Test the time series for a unit root.

h = adftest(Y)
h =

0

The result h = 0 indicates that this test fails to reject the null hypothesis of a unit root against the autoregressive alternative.

### Conduct an Augmented Dickey-Fuller Test Against a Trend-Stationary Alternative

Test a time series for a unit root against a trend-stationary alternative augmented with lagged difference terms.

Load a time series of GDP data, and calculate its log.

Y = log(Data);

Test for a unit root against a trend-stationary alternative, augmenting the model with 0, 1, and 2 lagged difference terms.

h = adftest(Y,'model','TS','lags',0:2)
h =

0     0     0

adftest treats the three lag choices as three separate tests, and returns a vector with rejection decisions for each test. The values h = 0 indicate that all three tests fail to reject the null hypothesis of a unit root against the trend-stationary alternative.

### Choose the Number of Lagged Difference Terms to Include in the Augmented Model

Test a time series for a unit root against trend-stationary alternatives augmented with different numbers of lagged difference terms. Look at the regression statistics corresponding to each of the alternative models to choose how many lagged difference terms to include in the augmented model.

Load a time series of GDP data, and calculate its log.

Y = log(Data);

Test for a unit root using three different choices for the number of lagged difference terms. Return the regression statistics for each alternative model.

[~,~,~,~,reg] = adftest(Y,'model','TS','lags',0:2);

adftest treats each of the three lag choices as a separate test, and returns results for each test. reg is an array of three data structures, one for each alternative model.

Display the names of the coefficients included in each of the three alternatives.

reg.names
ans =

'c'
'd'
'a'

ans =

'c'
'd'
'a'
'b1'

ans =

'c'
'd'
'a'
'b1'
'b2'

The output shows which terms are included in the three alternative models. The first model has no added difference terms, the second model has one difference term (b1), and the third model has two difference terms (b1 and b2).

Display the t-statistics and corresponding p-values for each coefficient in the three alternative models.

[reg(1).tStats.t reg(1).tStats.pVal]
[reg(2).tStats.t reg(2).tStats.pVal]
[reg(3).tStats.t reg(3).tStats.pVal]
ans =

2.0533    0.0412
1.8842    0.0608
61.4717    0.0000

ans =

2.9026    0.0041
2.7681    0.0061
64.1396    0.0000
5.6514    0.0000

ans =

3.2568    0.0013
3.1249    0.0020
62.7825    0.0000
4.7586    0.0000
1.7615    0.0795

The returned t-statistics and p-values correspond to the coefficients in reg.names. At the 0.05 level, the coefficient on the first difference term (b1) is significantly different from zero in both the second and third models, but the coefficient on the second difference term (b2) in the third model is not (p-value 0.0795). This suggests that augmenting the model with one lagged difference term is adequate.

Compare the BIC for each of the three alternatives.

reg.BIC
ans =

-1.4774e+03

ans =

-1.4966e+03

ans =

-1.4878e+03

Based on the BIC values, choose the model augmented with one lagged difference term because it has the best (that is, the smallest) BIC value.
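The same selection can be made programmatically. The sketch below uses a stand-in structure array that holds only the BIC values printed above; in practice reg would come from adftest. The variable names lags, idx, and bestLags are illustrative.

```matlab
% BIC values copied from the output above, one per lag choice
lags = 0:2;
reg  = struct('BIC',{-1.4774e+03, -1.4966e+03, -1.4878e+03});

% Collect the BIC field across the structure array and locate the minimum
[~,idx]  = min([reg.BIC]);
bestLags = lags(idx)
```

Here bestLags is 1, which matches the choice of one lagged difference term.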

## Input Arguments

### Y — Univariate time series

column vector

Univariate time series, specified as a column vector. The last element is the most recent observation. adftest ignores missing observations, indicated by NaNs.

Data Types: double

### Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: 'alpha',0.1,'lags',0:2 specifies three tests with 0, 1, and 2 lagged difference terms conducted at the 0.1 significance level

### 'alpha' — Significance levels

0.05 (default) | scalar | vector

Significance levels for the hypothesis tests, specified as the comma-separated pair consisting of 'alpha' and a scalar or vector. Use a vector to conduct multiple tests. All values of alpha must be between 0.001 and 0.999.

Example: 'alpha',0.01

Data Types: double

### 'lags' — Number of lagged difference terms

0 (default) | nonnegative integer | vector of nonnegative integers

Number of lagged difference terms to include in the model, specified as the comma-separated pair consisting of 'lags' and a nonnegative integer or vector of nonnegative integers. Use a vector to conduct multiple tests.

Example: 'lags',[0,1,2]

Data Types: double

### 'model' — Model variant

'AR' (default) | 'ARD' | 'TS'

Model variant, specified as the comma-separated pair consisting of 'model' and a string or cell array of strings. Use a cell array of strings to conduct multiple tests with different model variants. The possible values are:

• 'AR': Autoregressive model variant, which specifies a test of the null model

${y}_{t}={y}_{t-1}+{\beta }_{1}\Delta {y}_{t-1}+{\beta }_{2}\Delta {y}_{t-2}+\dots +{\beta }_{p}\Delta {y}_{t-p}+{\epsilon }_{t}$

against the alternative model

${y}_{t}=\varphi {y}_{t-1}+{\beta }_{1}\Delta {y}_{t-1}+{\beta }_{2}\Delta {y}_{t-2}+\dots +{\beta }_{p}\Delta {y}_{t-p}+{\epsilon }_{t},$

with AR(1) coefficient, $\varphi <1.$

• 'ARD': Autoregressive model with drift variant, which specifies a test of the null model

${y}_{t}={y}_{t-1}+{\beta }_{1}\Delta {y}_{t-1}+{\beta }_{2}\Delta {y}_{t-2}+\dots +{\beta }_{p}\Delta {y}_{t-p}+{\epsilon }_{t}$

against the alternative model

${y}_{t}=c+\varphi {y}_{t-1}+{\beta }_{1}\Delta {y}_{t-1}+{\beta }_{2}\Delta {y}_{t-2}+\dots +{\beta }_{p}\Delta {y}_{t-p}+{\epsilon }_{t},$

with drift coefficient, c, and AR(1) coefficient, $\varphi <1.$

• 'TS': Trend-stationary model variant, which specifies a test of the null model

${y}_{t}=c+{y}_{t-1}+{\beta }_{1}\Delta {y}_{t-1}+{\beta }_{2}\Delta {y}_{t-2}+\dots +{\beta }_{p}\Delta {y}_{t-p}+{\epsilon }_{t}$

against the alternative model

${y}_{t}=c+\delta t+\varphi {y}_{t-1}+{\beta }_{1}\Delta {y}_{t-1}+{\beta }_{2}\Delta {y}_{t-2}+\dots +{\beta }_{p}\Delta {y}_{t-p}+{\epsilon }_{t},$

with drift coefficient, c, deterministic trend coefficient, δ, and AR(1) coefficient, $\varphi <1.$

Example: 'model',{'AR','ARD'}
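To illustrate passing several model variants at once, the following sketch tests one series against all three alternatives. The simulated series, the rng seed, and the variable name models are assumptions for illustration only.

```matlab
% Hypothetical data: a simulated random walk (unit root by construction)
rng(0);
Y = cumsum(randn(200,1));

% A cell array of model names requests one test per variant
models = {'AR','ARD','TS'};
[h,pValue] = adftest(Y,'model',models);

% Report the decision and p-value for each variant
for k = 1:numel(models)
    fprintf('%-3s  h = %d  p = %.4f\n', models{k}, h(k), pValue(k));
end
```

Which variant is appropriate depends on the growth characteristics of the series, so running all three is a diagnostic aid rather than a substitute for choosing a model.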

### 'test' — Test statistic

't1' (default) | 't2' | 'F'

Test statistic, specified as the comma-separated pair consisting of 'test' and a string or cell array of strings with these possible values:

• 't1': Standard t statistic,

${t}_{1}=\frac{\left(\hat{\varphi }-1\right)}{se},$

computed using the OLS estimate of the AR(1) coefficient, $\hat{\varphi },$ and its standard error (se), in the alternative model. The test assesses the significance of the restriction, $\varphi -1=0.$

• 't2': Lag-adjusted, unstudentized t statistic,

${t}_{2}=\frac{N\left(\hat{\varphi }-1\right)}{\left(1-{\hat{\beta }}_{1}-\dots -{\hat{\beta }}_{p}\right)},$

computed using the OLS estimates of the AR(1) coefficient and stationary coefficients in the alternative model. N is the effective sample size, adjusted for lags and missing values. The test assesses the significance of the restriction, $\varphi -1=0.$

• 'F': F statistic for assessing the significance of a joint restriction on the alternative model. For model variant 'ARD', the restrictions are $\varphi -1=0$ and c = 0. For model variant 'TS', the restrictions are $\varphi -1=0$ and δ = 0. An F statistic is invalid for model variant 'AR'.

Use a cell array of strings to conduct multiple tests using different test statistics.

Example: 'test','t2'
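The 't1' and 't2' formulas can be reproduced by hand with a plain OLS fit. The sketch below does this for the 'AR' variant with one lagged difference term, using only base MATLAB. The simulated data, the variable names, and the degrees-of-freedom correction in the variance estimate are assumptions; adftest may differ in such internal details, so small numerical differences from its stat output would not be surprising.

```matlab
% Hypothetical data: simulated random walk (assumption, not from the examples above)
rng(1);
y = cumsum(randn(200,1));
p = 1;                               % number of lagged difference terms

dy  = diff(y);                       % dy(k) = y(k+1) - y(k)
idx = (p+2:numel(y))';               % time points where every regressor is observed
X   = y(idx-1);                      % lagged level y_{t-1}
for j = 1:p
    X = [X, dy(idx-j-1)];            % lagged difference: y_{t-j} - y_{t-j-1}
end

% OLS fit of the 'AR' alternative: y_t = phi*y_{t-1} + sum_j beta_j*dy_{t-j} + e_t
b   = X \ y(idx);
res = y(idx) - X*b;
N   = numel(idx);                    % effective sample size after losing p+1 points
s2  = (res'*res)/(N - size(X,2));    % residual variance estimate (assumed correction)
se  = sqrt(diag(s2*((X'*X)\eye(size(X,2)))));

t1 = (b(1) - 1)/se(1)                        % studentized statistic
t2 = N*(b(1) - 1)/(1 - sum(b(2:end)))        % lag-adjusted, unstudentized statistic
```

Both statistics are compared against the left tail of their null distributions, so large negative values count as evidence against a unit root.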

## Output Arguments

### h — Test rejection decisions

logical | vector of logicals

Test rejection decisions, returned as a logical value or vector of logical values with length equal to the number of tests conducted.

• h = 1 indicates rejection of the unit-root null in favor of the alternative model.

• h = 0 indicates failure to reject the unit-root null.

### pValue — Test statistic p-values

scalar | vector

Test statistic p-values, returned as a scalar or vector with length equal to the number of tests conducted.

• If the test statistic is 't1' or 't2', then the p-values are left-tail probabilities.

• If the test statistic is 'F', then the p-values are right-tail probabilities.

### stat — Test statistics

scalar | vector

Test statistics, returned as a scalar or vector with length equal to the number of tests conducted. adftest computes test statistics using ordinary least squares (OLS) estimates of the coefficients in the alternative model.

### cValue — Critical values

scalar | vector

Critical values, returned as a scalar or vector with length equal to the number of tests conducted.

• If the test statistic is 't1' or 't2', then the critical values are for left-tail probabilities.

• If the test statistic is 'F', then the critical values are for right-tail probabilities.
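For a left-tail test, the outputs relate to one another in the usual way: the null is rejected when the statistic falls below the critical value. The sketch below checks this on simulated data. The data, seed, and the names rejectByStat and rejectByP are assumptions, and the comparison reflects the standard decision rule rather than a documented description of adftest internals.

```matlab
% Hypothetical data: a simulated random walk
rng(0);
Y = cumsum(randn(200,1));

% Default test statistic is 't1', so the test is left-tailed
alpha = 0.05;
[h,pValue,stat,cValue] = adftest(Y,'alpha',alpha);

% Left-tail decision rule expressed two ways
rejectByStat = stat < cValue;
rejectByP    = pValue < alpha;

disp([h rejectByStat rejectByP])
```

For an 'F' test the inequality on the statistic reverses, because the p-values and critical values are for right-tail probabilities.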

### reg — Regression statistics

data structure | data structure array

Regression statistics for ordinary least squares (OLS) estimation of coefficients in the alternative model, returned as a data structure or data structure array with length equal to the number of tests conducted.

Each data structure has the following fields.

| Field | Description |
| --- | --- |
| num | Length of input series with NaNs removed |
| size | Effective sample size, adjusted for lags |
| names | Regression coefficient names |
| coeff | Estimated coefficient values |
| se | Estimated coefficient standard errors |
| Cov | Estimated coefficient covariance matrix |
| tStats | t statistics of coefficients and p-values |
| FStat | F statistic and p-value |
| yMu | Mean of the lag-adjusted input series |
| ySigma | Standard deviation of the lag-adjusted input series |
| yHat | Fitted values of the lag-adjusted input series |
| res | Regression residuals |
| DWStat | Durbin-Watson statistic |
| SSR | Regression sum of squares |
| SSE | Error sum of squares |
| SST | Total sum of squares |
| MSE | Mean square error |
| RMSE | Standard error of the regression |
| RSq | R² statistic |
| LL | Loglikelihood of data under Gaussian innovations |
| AIC | Akaike information criterion |
| BIC | Bayesian (Schwarz) information criterion |
| HQC | Hannan-Quinn information criterion |
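As a brief sketch of how these fields might be read, the following prints each coefficient name alongside its estimate and standard error for a single test. The simulated series and seed are assumptions for illustration.

```matlab
% Hypothetical data: a simulated random walk
rng(0);
Y = cumsum(randn(200,1));

% One test, so reg is a single data structure
[~,~,~,~,reg] = adftest(Y,'model','ARD','lags',1);

% Pair each coefficient name with its estimate and standard error
for k = 1:numel(reg.names)
    fprintf('%-3s  coeff = %8.4f  se = %7.4f\n', ...
        reg.names{k}, reg.coeff(k), reg.se(k));
end

% Scalar summaries are plain fields
fprintf('effective sample size = %d, BIC = %.2f\n', reg.size, reg.BIC);
```

When several tests are requested, reg is a structure array, and a field can be gathered across tests with bracket syntax such as [reg.BIC].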

### Augmented Dickey-Fuller Test for a Unit Root

The augmented Dickey-Fuller test for a unit root assesses the null hypothesis of a unit root using the model

${y}_{t}=c+\delta t+\varphi {y}_{t-1}+{\beta }_{1}\Delta {y}_{t-1}+\dots +{\beta }_{p}\Delta {y}_{t-p}+{\epsilon }_{t},$

where

• Δ is the differencing operator, such that $\Delta {y}_{t}={y}_{t}-{y}_{t-1}.$

• The number of lagged difference terms, p, is user specified.

• ${\epsilon }_{t}$ is a mean zero innovation process.

The null hypothesis of a unit root is

${H}_{0}:\varphi =1.$

Under the alternative hypothesis, $\varphi <1.$

Variants of the model allow for different growth characteristics. The model with δ = 0 has no trend component, and the model with c = 0 and δ = 0 has no drift or trend.

A test that fails to reject the null hypothesis fails to rule out the possibility of a unit root. It does not establish that a unit root is present.
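Subtracting ${y}_{t-1}$ from both sides of the model gives an equivalent form in which the hypothesis becomes a zero restriction on a single coefficient. This rearrangement is ordinary algebra and is offered here as a reading aid, not as a statement of how adftest parameterizes the regression:

$\Delta {y}_{t}=c+\delta t+\left(\varphi -1\right){y}_{t-1}+{\beta }_{1}\Delta {y}_{t-1}+\dots +{\beta }_{p}\Delta {y}_{t-p}+{\epsilon }_{t}.$

In this form, the null hypothesis $\varphi =1$ says that the coefficient on ${y}_{t-1}$ is zero, which is the restriction $\varphi -1=0$ assessed by the test statistics, and the alternative $\varphi <1$ says that the coefficient is negative.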

### Algorithms

• adftest performs ordinary least squares (OLS) regression to estimate the coefficients in the alternative model.

• Dickey-Fuller statistics follow nonstandard distributions under the null hypothesis (even asymptotically). Critical values for a range of sample sizes and significance levels have been tabulated using Monte Carlo simulations of the null model with Gaussian innovations, with five million replications per sample size.

• For small samples, the tabulated critical values are only valid for Gaussian innovations. For large samples, the tabulated values are still valid for non-Gaussian innovations.

• adftest interpolates critical values and p-values from the tables. The tables for test types 't1' and 't2' are identical to those for pptest.
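The Monte Carlo idea behind the tabulated critical values can be sketched on a small scale. The code below simulates the 'AR' null model with no lagged difference terms and Gaussian innovations, computes the t1 statistic for each replication, and reads off an empirical 5% left-tail critical value. The sample size, replication count, seed, and variable names are assumptions chosen to keep the run short; this is not the procedure used to build the actual tables, which use five million replications per sample size.

```matlab
rng(0);                               % reproducible draws (arbitrary seed)
T    = 100;                           % hypothetical sample size
nRep = 5000;                          % far fewer replications than the real tables
t1   = zeros(nRep,1);

for r = 1:nRep
    y   = cumsum(randn(T,1));         % null model: random walk, Gaussian innovations
    x   = y(1:end-1);                 % regressor y_{t-1}
    z   = y(2:end);                   % response y_t
    phi = (x'*z)/(x'*x);              % OLS estimate in y_t = phi*y_{t-1} + e_t
    res = z - phi*x;
    s2  = (res'*res)/(numel(z) - 1);  % residual variance estimate
    se  = sqrt(s2/(x'*x));            % standard error of phi
    t1(r) = (phi - 1)/se;
end

% Empirical 5% quantile of the simulated null distribution
t1 = sort(t1);
cv = t1(ceil(0.05*nRep))
```

With these settings the estimate should fall near the commonly tabulated 5% Dickey-Fuller value of about -1.95 for the model without drift or trend, well below the -1.645 that a standard normal distribution would give. The exact figure varies with the seed and the replication count.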