# Filter Data Through State-Space Model in Real Time

This example shows how to nowcast a state-space model. In other words, the example updates the state distribution moments as economic data becomes available, even when individual measurements are released asynchronously.

This example uses a state-space formulation for a dynamic factor model. The model variables include:

• Two states, which compose a vector autoregression of demand and supply.

• Quarterly growth rates of US CPI, GDP, 3-month treasury bill yield, and the unemployment rate. Each measurement is a variable in the observation equation and each is a linear combination of supply and demand perturbed by measurement error.

Symbolically, the state-space formulation is

`$\begin{array}{l}\left[\begin{array}{c}{x}_{1,t}\\ {x}_{2,t}\end{array}\right]=\left[\begin{array}{cc}{\varphi }_{11}& 0\\ {\varphi }_{21}& {\varphi }_{22}\end{array}\right]\left[\begin{array}{c}{x}_{1,t-1}\\ {x}_{2,t-1}\end{array}\right]+\left[\begin{array}{cc}1& 0\\ 0& 1\end{array}\right]\left[\begin{array}{c}{u}_{1,t}\\ {u}_{2,t}\end{array}\right]\\ \left[\begin{array}{c}{y}_{1,t}\\ {y}_{2,t}\\ {y}_{3,t}\\ {y}_{4,t}\end{array}\right]=\left[\begin{array}{cc}{C}_{11}& {C}_{12}\\ {C}_{21}& {C}_{22}\\ {C}_{31}& {C}_{32}\\ {C}_{41}& {C}_{42}\end{array}\right]\left[\begin{array}{c}{x}_{1,t}\\ {x}_{2,t}\end{array}\right]+\left[\begin{array}{cccc}{D}_{11}& 0& 0& 0\\ 0& {D}_{22}& 0& 0\\ 0& 0& {D}_{33}& 0\\ 0& 0& 0& {D}_{44}\end{array}\right]\left[\begin{array}{c}{\epsilon }_{1,t}\\ {\epsilon }_{2,t}\\ {\epsilon }_{3,t}\\ {\epsilon }_{4,t}\end{array}\right].\end{array}$`

To nowcast the model, the example performs the following procedure:

1. Create a state-space model template for estimation.

2. Define a holdout set from the estimation data.

3. Fit the model to the in-sample data.

4. Configure the holdout data set to simulate asynchronously released measurements.

5. Nowcast the estimated model to update the state distribution states.

### Create State-Space Model Estimation Template

To estimate a state-space model, the `estimate` function requires a template specifying the structure of the model, which parameters are unknown or fixed, and other characteristics.

Create a state-space model for the dynamic factor model. Specify NaN values for unknown parameters, and specify that the states are stationary processes by setting their type (`StateType`) to `0`.

```numStates = 2; numSeries = 4; A = tril(nan(numStates)); B = eye(numStates); C = nan(numSeries,numStates); D = diag(NaN(numSeries,1)); Mdl = ssm(A,B,C,D,StateType=[0 0]);```

`Mdl` is an `ssm` object representing a template for estimation.

Load the US macroeconomic data set `Data_USEconModel.mat`. The data contains quarterly measurements of 14 economic series from Q1:1947 through Q1:2009.

`load Data_USEconModel`

`DataTimeTable` is a 249-by-14 timetable that contains the data and sample timestamps. For more details, enter `Description` at the command line.

Extract the CPI, GDP, 3-month treasury bill yield, and the unemployment rate.

```DataLevel = DataTimeTable{:,["CPIAUCSL" "GDP" "TB3MS" "UNRATE"]}; anyMissing = any(any(ismissing(DataLevel)))```
```anyMissing = logical 1 ```

`DataLevel` is a 249-by-4 numeric matrix of the measurements. At least one series contains missing values, but the Kalman filter accommodates missing observations by rolling forward the current state distribution as the next state forecast.

Convert the levels to returns by passing the data to `price2ret`.

`DataGrowth = price2ret(DataLevel)*100;`

Because the transformation uses the first difference, `DataGrowth` has one less observation than `DataLevel`.

Reserve the final 3 years (12 quarters) of data as a holdout sample. Extract corresponding dates.

```numPeriods = 12; Y = DataGrowth(1:(end-numPeriods),:); % In-sample data Yf = DataGrowth((end-numPeriods+1):end,:); % Holdout sample T = size(DataGrowth,1); dates = DataTimeTable.Time(2:end-numPeriods); datesf = DataTimeTable.Time(end-numPeriods+1:end);```

### Estimate Model

The estimation procedure requires initial values for all unknown parameters. Create a vector of containing initial values. For example, if `A0`, `C0`, and `D0` are matrices containing initial values for `A`, `B`, and `C`, arrange the initial values by creating the vector `[A0(:,1); A0(:,2); C0(:,1); C0(:,2); D0(:,1); D0(:,2); D0(:,3); D0(:,4)]`. Only unknown parameters require initial values.

`params0 = [0.8; 0; 0.5; ones(numStates*numSeries,1); 0.5*std(Y,"omitNan")'];`

Fit the model to the in-sample data. Specify the initial values and the univariate treatment of the multivariate model. Turn off the estimation summary.

`EstMdl = estimate(Mdl,Y,params0,Univariate=1,Display="off")`
```EstMdl = State-space model type: ssm State vector length: 2 Observation vector length: 4 State disturbance vector length: 2 Observation innovation vector length: 4 Sample size supported by model: Unlimited State variables: x1, x2,... State disturbances: u1, u2,... Observation series: y1, y2,... Observation innovations: e1, e2,... State equations: x1(t) = (0.60)x1(t-1) + u1(t) x2(t) = -(0.23)x1(t-1) + (0.99)x2(t-1) + u2(t) Observation equations: y1(t) = -(0.07)x1(t) + (0.12)x2(t) + (0.61)e1(t) y2(t) = -(0.70)x1(t) + (0.18)x2(t) + (0.63)e2(t) y3(t) = -(6.87)x1(t) + (0.07)x2(t) + (16.18)e3(t) y4(t) = (5.30)x1(t) + (0.07)x2(t) + (3.96)e4(t) Initial state distribution: Initial state means x1 x2 0 0 Initial state covariance matrix x1 x2 x1 1.57 -0.53 x2 -0.53 76.34 State types x1 x2 Stationary Stationary ```

Check whether the estimated VAR state model is stable.

```varlagop = LagOp({eye(numStates) EstMdl.A}); isStable(varlagop)```
```ans = logical 1 ```

The VAR state model is stable.

### Simulate Asynchronous Release of Measurements

To simulate the asynchronous release of economic measurements within a quarter, suppose each measurement is released in successive weeks. For observation `t` in the holdout sample:

1. Create a `numSeries`-by-`numSeries` matrix of `NaN` values.

2. Replace the diagonal elements of the NaN matrix with the observations of period `t`.

Stack each matrix.

```YfStretch = nan(numPeriods*numSeries,numSeries); for t = 1:numPeriods for j = 1:numSeries YfStretch((t-1)*numSeries + j,j) = Yf(t,j); end end```

`YfStretch` is a `numPeriods*numSeries`-by-`numSeries` (48-by-4) block matrix of asynchronously released measurements over the holdout sample period. For example, block 1 is `YfStretch(1:4,:)` and it represents the first observation in the holdout sample `Yf(1,:)` where the four measurements are released at different times or subperiods (rows), block 2 is `YfStretch(5:8,:)` and it represents the second observation in the holdout sample `Yf(2,:)` where the four measurement are released at different times, and so on.

The estimated model `EstMdl` is calibrated so that states transition each period (row in the data). States in the holdout sample must transition only after processing the incoming unemployment growth rate. To achieve this action, create a model for the subperiod observations so that the states remain the same as it processes an observation. Symbolically,

`$\begin{array}{l}\left[\begin{array}{c}{x}_{1,t}\\ {x}_{2,t}\end{array}\right]=\left[\begin{array}{cc}1& 0\\ 0& 1\end{array}\right]\left[\begin{array}{c}{x}_{1,t-1}\\ {x}_{2,t-1}\end{array}\right]\\ \left[\begin{array}{c}{y}_{1,t}\\ {y}_{2,t}\\ {y}_{3,t}\\ {y}_{4,t}\end{array}\right]=\left[\begin{array}{cc}{\underset{}{\overset{ˆ}{C}}}_{11}& {\underset{}{\overset{ˆ}{C}}}_{12}\\ {\underset{}{\overset{ˆ}{C}}}_{21}& {\underset{}{\overset{ˆ}{C}}}_{22}\\ {\underset{}{\overset{ˆ}{C}}}_{31}& {\underset{}{\overset{ˆ}{C}}}_{32}\\ {\underset{}{\overset{ˆ}{C}}}_{41}& {\underset{}{\overset{ˆ}{C}}}_{42}\end{array}\right]\left[\begin{array}{c}{x}_{1,t}\\ {x}_{2,t}\end{array}\right]+\left[\begin{array}{cccc}{\underset{}{\overset{ˆ}{D}}}_{11}& 0& 0& 0\\ 0& {\underset{}{\overset{ˆ}{D}}}_{22}& 0& 0\\ 0& 0& {\underset{}{\overset{ˆ}{D}}}_{33}& 0\\ 0& 0& 0& {\underset{}{\overset{ˆ}{D}}}_{44}\end{array}\right]\left[\begin{array}{c}{\epsilon }_{1,t}\\ {\epsilon }_{2,t}\\ {\epsilon }_{3,t}\\ {\epsilon }_{4,t}\end{array}\right],\end{array}$`

with a degenerate initial state distribution.

```EstMdlAux = ssm(eye(numStates),zeros(numStates),EstMdl.C,EstMdl.D,... Mean0=zeros(numStates,1),Cov0=zeros(numStates));```

### Nowcast the Model

Obtain state distribution moments at the end of the in-sample period.

```[states,~,output] = filter(EstMdl,Y); currentState = states(end,:)'; CurrentStateCov = output(end).FilteredStatesCov;```

Preallocate arrays for the nowcasts and dates for the stretched periods.

```newStateF = nan(numPeriods*numSeries,numStates); NewStateCovF = cell(numPeriods*numSeries,1); datesfStretch = repmat(dates(end),numPeriods*numSeries,1);```

Nowcast the model by using this procedure for each block matrix in the stretched holdout sample:

1. When a CPI measurement is released, update the current state distribution moments by filtering the measurement through the estimated model `EstMdl`.

2. Suppose subsequent measurements are released one week apart from each other. When each other measurement is released, update the current state distribution moments by filtering the measurement through the estimated model `EstMdlAux`.

3. Replace the current state distribution with the updated distribution after each iteration.

```tStretch = 0; % Index of stretched forecast horizon wk = 0; for t = 1:numPeriods for j = 1:numSeries tStretch = tStretch + 1; if j == 1 % Advance to next full period [newStateF(tStretch,:),NewStateCovF{tStretch}] = update(EstMdl,YfStretch(tStretch,:), ... currentState,CurrentStateCov); wk = 0; datesfStretch(tStretch) = datesf(t); else % Stretch current period [newStateF(tStretch,:),NewStateCovF{tStretch}] = update(EstMdlAux,YfStretch(tStretch,:), ... currentState,CurrentStateCov); wk = wk + 1; datesfStretch(tStretch) = datesf(t) + wk; end currentState = newStateF(tStretch,:)'; CurrentStateCov = NewStateCovF{tStretch}; end end```

`newStateF` is a 48-by-2 block matrix of nowcasted demands and supplies during the subperiods of the holdout period. `NewStateCovF` is a 48-by-1 cell array of 2-by-2 covariance matrices for each nowcast.

Plot the in-sample demand and supply estimates for periods beyond 1993, and plot the nowcasts.

```dtp = dates > datetime(1994,1,1); figure h = plot(dates(dtp),states(dtp,:),"-",datesfStretch,newStateF,"--"); hold on h(3).Color = h(1).Color; h(4).Color = h(2).Color; yl = ylim; hp = patch([datesfStretch(1) datesfStretch(end) datesfStretch(end) datesfStretch(1)], ... yl([1,1,2,2]),[0.8 0.8 0.8]); uistack(hp,'bottom'); legend([h; hp],... ["Demand (in-sample)"; "Supply (in-sample)"; "Demand (updated)"; "Supply (updated)"; "Forecast horizon"],... Location="best") hold off``` 