## hmmestimate

Hidden Markov model parameter estimates from emissions and states

`[TRANS,EMIS] = hmmestimate(seq,states)`

`hmmestimate(...,'Symbols',SYMBOLS)`

`hmmestimate(...,'Statenames',STATENAMES)`

`hmmestimate(...,'Pseudoemissions',PSEUDOE)`

`hmmestimate(...,'Pseudotransitions',PSEUDOTR)`

`[TRANS,EMIS] = hmmestimate(seq,states)` calculates
the maximum likelihood estimate of the transition, `TRANS`,
and emission, `EMIS`, probabilities of a hidden Markov
model for sequence, `seq`, with known states, `states`.
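When the states are known, the maximum likelihood estimate reduces to counting and normalizing. The following is a minimal sketch of that idea on a short, hand-made sequence, not the toolbox implementation. The variable names (`transCounts`, `emisManual`, and so on) are illustrative, and the row-wise division assumes implicit expansion (R2016b or later).

```matlab
% Toy data: 2 states, 3 symbols (illustrative values, not from the toolbox docs).
states = [1 1 1 1 2 2 2 2];
seq    = [1 2 3 1 3 3 3 3];
M = max(states);   % number of states
N = max(seq);      % number of possible emissions

% Count transitions i -> j between consecutive entries of states.
transCounts = accumarray([states(1:end-1)' states(2:end)'], 1, [M M]);

% Count emissions of symbol k while in state i.
emisCounts = accumarray([states' seq'], 1, [M N]);

% Normalize each row so that it sums to 1.
transManual = transCounts ./ sum(transCounts, 2);   % [0.75 0.25; 0 1]
emisManual  = emisCounts  ./ sum(emisCounts, 2);    % [0.5 0.25 0.25; 0 0 1]

% Compare with the toolbox estimate.
[transHat, emisHat] = hmmestimate(seq, states);
maxTransDiff = max(abs(transManual(:) - transHat(:)));
maxEmisDiff  = max(abs(emisManual(:)  - emisHat(:)));
```

Note the zero entries: state 2 never moves to state 1 and never emits symbols 1 or 2 in this sample, so the plain estimate assigns those events probability 0.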

`hmmestimate(...,'Symbols',SYMBOLS)` specifies
the symbols that are emitted. `SYMBOLS` can be a
numeric array or a cell array of the names of the symbols. The default
symbols are integers 1 through N, where N is the number of possible
emissions.

`hmmestimate(...,'Statenames',STATENAMES)` specifies
the names of the states. `STATENAMES` can be a numeric
array or a cell array of the names of the states. The default state
names are 1 through `M`, where `M` is
the number of states.
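As an illustration of named symbols and states, the sketch below passes cell arrays of names for both. The coin-flip data and the names `'H'`, `'T'`, `'fair'`, and `'biased'` are made up for this example. It assumes that rows and columns of the outputs follow the order given in `STATENAMES` and `SYMBOLS`.

```matlab
% Observations and states given by name rather than by index.
seq    = {'H','T','H','H','T','T','T','H'};
states = {'fair','fair','fair','fair','biased','biased','biased','biased'};

symbols    = {'H','T'};          % column order of EMIS
statenames = {'fair','biased'};  % row (and column) order of TRANS

[transHat, emisHat] = hmmestimate(seq, states, ...
    'Symbols', symbols, 'Statenames', statenames);

disp(transHat)   % 2-by-2 transition estimates
disp(emisHat)    % 2-by-2 emission estimates
```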

`hmmestimate(...,'Pseudoemissions',PSEUDOE)` specifies
pseudocount emission values in the matrix `PSEUDOE`.
Use this argument to avoid zero probability estimates for emissions
with very low probability that might not be represented in the sample
sequence. `PSEUDOE` should be a matrix of size *m*-by-*n*,
where *m* is the number of states in the hidden Markov
model and *n* is the number of possible emissions.
If the $$i\to k$$ emission does not occur in `seq`,
you can set `PSEUDOE(i,k)` to be a positive number
representing an estimate of the expected number of such emissions
in the sequence `seq`.
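A small sketch of the effect of `'Pseudoemissions'`, using a hand-made sequence in which state 2 never emits symbols 1 or 2. The data and the choice of one pseudocount per missing emission are illustrative assumptions.

```matlab
% Toy data: state 2 only ever emits symbol 3.
states = [1 1 1 1 2 2 2 2];
seq    = [1 2 3 1 3 3 3 3];

% Without pseudocounts, the unseen emissions get probability 0.
[~, emisRaw] = hmmestimate(seq, states);

% One pseudocount for each emission that is missing from the sample.
% PSEUDOE is m-by-n: 2 states by 3 possible emissions.
pseudoE = [0 0 0;
           1 1 0];
[~, emisSmoothed] = hmmestimate(seq, states, 'Pseudoemissions', pseudoE);

disp(emisRaw(2,:))        % contains zeros
disp(emisSmoothed(2,:))   % every entry is positive
```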

`hmmestimate(...,'Pseudotransitions',PSEUDOTR)` specifies
pseudocount transition values. You can use this argument to avoid
zero probability estimates for transitions with very low probability
that might not be represented in the sample sequence. `PSEUDOTR` should
be a matrix of size *m*-by-*m*,
where *m* is the number of states in the hidden Markov
model. If the $$i\to j$$ transition does
not occur in `states`, you can set `PSEUDOTR(i,j)` to
be a positive number representing an estimate of the expected number
of such transitions in the sequence `states`.
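The same idea applies to transitions. In the hand-made sequence below, the transition from state 2 to state 1 never occurs, so a single pseudocount is supplied for it. The data and the pseudocount value are illustrative.

```matlab
% Toy data: the path never returns from state 2 to state 1.
states = [1 1 1 1 2 2 2 2];
seq    = [1 2 3 1 3 3 3 3];

% Without pseudocounts, TRANS(2,1) is estimated as 0.
transRaw = hmmestimate(seq, states);

% One pseudocount for the unseen 2 -> 1 transition.
% PSEUDOTR is m-by-m: 2 states by 2 states.
pseudoTR = [0 0;
            1 0];
transSmoothed = hmmestimate(seq, states, 'Pseudotransitions', pseudoTR);

disp(transRaw(2,1))        % 0
disp(transSmoothed(2,1))   % positive
```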

If the probability of a specific transition or emission is very
low, the transition might never occur in the sequence `states`,
or the emission might never occur in the sequence `seq`.
In either case, the algorithm returns a probability of 0 for the given
transition or emission in `TRANS` or `EMIS`.
You can compensate for the absence of a transition or emission with the `'Pseudotransitions'` and `'Pseudoemissions'` arguments.
The simplest way to do this is to set the corresponding entry of `PSEUDOE` or `PSEUDOTR` to `1`.
For example, if the transition $$i\to j$$ does
not occur in `states`, set `PSEUDOTR(i,j) = 1`. This forces `TRANS(i,j)` to be positive.
If you have an estimate for the expected number of transitions $$i\to j$$ in a sequence of the same length
as `states`, and the actual number of transitions $$i\to j$$ that occur in `states` is
substantially less than what you expect, you can set `PSEUDOTR(i,j)` to
the expected number. This increases the value of `TRANS(i,j)`.
For transitions that do occur in `states` with the frequency you expect,
set the corresponding entry of `PSEUDOTR` to `0`,
which does not increase the corresponding entry of `TRANS`.
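One way to see why a pseudocount raises an estimate is to write it out, assuming the pseudocounts are added to the observed counts before each row is normalized (the usual pseudocount estimator). The symbol $$A_{ij}$$, introduced here for illustration, is the number of $$i\to j$$ transitions observed in `states`:

$$
\mathrm{TRANS}(i,j) = \frac{A_{ij} + \mathrm{PSEUDOTR}(i,j)}{\sum_{j'} \left( A_{ij'} + \mathrm{PSEUDOTR}(i,j') \right)}
$$

For instance, if state 2 is followed by state 2 three times and never by state 1, then $$A_{21} = 0$$ and $$A_{22} = 3$$. With `PSEUDOTR(2,1) = 1` and `PSEUDOTR(2,2) = 0`, the estimate becomes

$$
\mathrm{TRANS}(2,1) = \frac{0 + 1}{(0 + 1) + (3 + 0)} = \frac{1}{4}, \qquad \mathrm{TRANS}(2,2) = \frac{3 + 0}{(0 + 1) + (3 + 0)} = \frac{3}{4}
$$

instead of 0 and 1. The analogous expression with emission counts and `PSEUDOE` gives `EMIS`.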

If you do not know the sequence of states, use `hmmtrain` to
estimate the model parameters.
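A minimal sketch of that alternative: only the emission sequence is kept, and `hmmtrain` is started from initial guesses for both matrices. The guess values below are illustrative assumptions. The result depends on them and on the random sequence, so no particular output is implied.

```matlab
% Generate data from a known model, then discard the state path.
trans = [0.95 0.05; 0.10 0.90];
emis  = [1/6 1/6 1/6 1/6 1/6 1/6; ...
         1/10 1/10 1/10 1/10 1/10 1/2];
seq = hmmgenerate(1000, trans, emis);

% Initial guesses for the transition and emission matrices (rows sum to 1).
transGuess = [0.85 0.15; 0.10 0.90];
emisGuess  = [0.17 0.16 0.17 0.16 0.17 0.17; ...
              0.08 0.08 0.08 0.08 0.08 0.60];

% Estimate the parameters from the emissions alone.
[estTR, estE] = hmmtrain(seq, transGuess, emisGuess);
```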

    trans = [0.95,0.05; 0.10,0.90];
    emis = [1/6 1/6 1/6 1/6 1/6 1/6; ...
            1/10 1/10 1/10 1/10 1/10 1/2];
    [seq,states] = hmmgenerate(1000,trans,emis);
    [estimateTR,estimateE] = hmmestimate(seq,states);

