주요 콘텐츠

dnds

Estimate synonymous and nonsynonymous substitution rates

    Description

    [Dn,Ds,Vardn,Vards] = dnds(SeqNT1,SeqNT2) estimates the synonymous and nonsynonymous substitution rates per site between the two homologous nucleotide sequences SeqNT1 and SeqNT2 by comparing codons using the Nei-Gojobori method.

    This analysis:

    • Assumes that the nucleotide sequences SeqNT2 and SeqNT2 are codon-aligned; that is, they do not have frame shifts

    • Excludes codons that include ambiguous nucleotide characters or gaps

    • Considers the number of codons in the shorter of the two nucleotide sequences

    Caution

    If SeqNT1 and SeqNT2 are too short or too divergent, saturation can be reached, and dnds returns NaNs and a warning message.

    example

    [Dn,Ds,Vardn,Vards] = dnds(SeqNT1,SeqNT2,Name=Value) specifies options using one or more name-value arguments. For example, to use the Li-Wu-Luo method to calculate the substitution rate, use the command dnds(SeqNT1,SeqNT2,Method="LWL").

    Examples

    collapse all

    This example shows how to estimate synonymous and nonsynonymous substitution rates between two nucleotide sequences that are not codon-aligned.

    This example uses two nucleotide sequences representing the human HEXA gene (accession number NM_000520) and mouse HEXA gene (accession number AK080777).

    If you have live internet connection, you can use the getgenbank function to retrieve the sequence information from the NCBI data repository and load the data into MATLAB®.

    humanHEXA = getgenbank('NM_000520');
    mouseHEXA = getgenbank('AK080777');
    

    For your convenience, MATLAB provides these two sequences in hexosaminidase.mat. Note that data in public databases are frequently updated and curated, and the results in this example may slightly differ if you use the latest data. Load the sequences.

    load hexosaminidase.mat

    Extract the coding regions from the two nucleotide sequences.

    humanHEXA_cds = featureparse(humanHEXA,'feature','CDS','Sequence',true);
    mouseHEXA_cds = featureparse(mouseHEXA,'feature','CDS','Sequence',true);

    Align the amino acid sequences encoded by the nucleotide sequences.

    [sc,al] = nwalign(nt2aa(humanHEXA_cds),nt2aa(mouseHEXA_cds),'extendgap',1);

    Use the seqinsertgaps function to copy the gaps from the aligned amino acid sequences to their corresponding nucleotide sequences, thus codon-aligning them.

    humanHEXA_aligned = seqinsertgaps(humanHEXA_cds,al(1,:))
    humanHEXA_aligned = 
    'atgacaagctccaggctttggttttcgctgctgctggcggcagcgttcgcaggacgggcgacggccctctggccctggcctcagaacttccaaacctccgaccagcgctacgtcctttacccgaacaactttcaattccagtacgatgtcagctcggccgcgcagcccggctgctcagtcctcgacgaggccttccagcgctatcgtgacctgcttttcggttccgggtcttggccccgtccttacctcacagggaaacggcatacactggagaagaatgtgttggttgtctctgtagtcacacctggatgtaaccagcttcctactttggagtcagtggagaattataccctgaccataaatgatgaccagtgtttactcctctctgagactgtctggggagctctccgaggtctggagacttttagccagcttgtttggaaatctgctgagggcacattctttatcaacaagactgagattgaggactttccccgctttcctcaccggggcttgctgttggatacatctcgccattacctgccactctctagcatcctggacactctggatgtcatggcgtacaataaattgaacgtgttccactggcatctggtagatgatccttccttcccatatgagagcttcacttttccagagctcatgagaaaggggtcctacaaccctgtcacccacatctacacagcacaggatgtgaaggaggtcattgaatacgcacggctccggggtatccgtgtgcttgcagagtttgacactcctggccacactttgtcctggggaccaggtatccctggattactgactccttgctactctgggtctgagccctctggcacctttggaccagtgaatcccagtctcaataatacctatgagttcatgagcacattcttcttagaagtcagctctgtcttcccagatttttatcttcatcttggaggagatgaggttgatttcacctgctggaagtccaacccagagatccaggactttatgaggaagaaaggcttcggtgaggacttcaagcagctggagtccttctacatccagacgctgctggacatcgtctcttcttatggcaagggctatgtggtgtggcaggaggtgtttgataataaagtaaagattcagccagacacaatcatacaggtgtggcgagaggatattccagtgaactatatgaaggagctggaactggtcaccaaggccggcttccgggcccttctctctgccccctggtacctgaaccgtatatcctatggccctgactggaaggatttctacatagtggaacccctggcatttgaaggtacccctgagcagaaggctctggtgattggtggagaggcttgtatgtggggagaatatgtggacaacacaaacctggtccccaggctctggcccagagcaggggctgttgccgaaaggctgtggagcaacaagttgacatctgacctgacatttgcctatgaacgtttgtcacacttccgctgtgaattgctgaggcgaggtgtccaggcccaacccctcaatgtaggcttctgtgagcaggagtttgaacagacctga'
    
    mouseHEXA_aligned = seqinsertgaps(mouseHEXA_cds,al(3,:))
    mouseHEXA_aligned = 
    'atggccggctgcaggctctgggtttcgctgctgctggcggcggcgttggcttgcttggccacggcactgtggccgtggccccagtacatccaaacctaccaccggcgctacaccctgtaccccaacaacttccagttccggtaccatgtcagttcggccgcgcaggcgggctgcgtcgtcctcgacgaggcctttcgacgctaccgtaacctgctcttcggttccggctcttggccccgacccagcttctcaaataaacagcaaacgttggggaagaacattctggtggtctccgtcgtcacagctgaatgtaatgaatttcctaatttggagtcggtagaaaattacaccctaaccattaatgatgaccagtgtttactcgcctctgagactgtctggggcgctctccgaggtctggagactttcagtcagcttgtttggaaatcagctgagggcacgttctttatcaacaagacaaagattaaagactttcctcgattccctcaccggggcgtactgctggatacatctcgccattacctgccattgtctagcatcctggatacactggatgtcatggcatacaataaattcaacgtgttccactggcacttggtggacgactcttccttcccatatgagagcttcactttcccagagctcaccagaaaggggtccttcaaccctgtcactcacatctacacagcacaggatgtgaaggaggtcattgaatacgcaaggcttcggggtatccgtgtgctggcagaatttgacactcctggccacactttgtcctgggggccaggtgcccctgggttattaacaccttgctactctgggtctcatctctctggcacatttggaccggtgaaccccagtctcaacagcacctatgacttcatgagcacactcttcctggagatcagctcagtcttcccggacttttatctccacctgggaggggatgaagtcgacttcacctgctggaagtccaaccccaacatccaggccttcatgaagaaaaagggcttt---actgacttcaagcagctggagtccttctacatccagacgctgctggacatcgtctctgattatgacaagggctatgtggtgtggcaggaggtatttgataataaagtgaaggttcggccagatacaatcatacaggtgtggcgggaagaaatgccagtagagtacatgttggagatgcaagatatcaccagggctggcttccgggccctgctgtctgctccctggtacctgaaccgtgtaaagtatggccctgactggaaggacatgtacaaagtggagcccctggcgtttcatggtacgcctgaacagaaggctctggtcattggaggggaggcctgtatgtggggagagtatgtggacagcaccaacctggtccccagactctggcccagagcgggtgccgtcgctgagagactgtggagcagtaacctgacaactaatatagactttgcctttaaacgtttgtcgcatttccgttgtgagctggtgaggagaggaatccaggcccagcccatcagtgtaggctgctgtgagcaggagtttgagcagacttga'
    

    Estimate the synonymous and nonsynonymous substitutions rates of the codon-aligned nucleotide sequences and also display the codons considered in the computations and their amino acid translations.

    [Dn,Ds,Vardn,Vards] = dnds(humanHEXA_aligned,mouseHEXA_aligned,Verbose=true)
    DNDS: 
    Codons considered in the computations:
    ATGACAAGCTCCAGGCTTTGGTTTTCGCTGCTGCTGGCGGCAGCGTTCGCAGGACGGGCGACGGCCCTCTGGCCCTGGCCTCAGAACTTCCAAACCTCCGACCAGCGCTACGTCCTTTACCCGAACAACTTTCAATTCCAGTACGATGTCAGCTCGGCCGCGCAGCCCGGCTGCTCAGTCCTCGACGAGGCCTTCCAGCGCTATCGTGACCTGCTTTTCGGTTCCGGGTCTTGGCCCCGTCCTTACCTCACAGGGAAACGGCATACACTGGAGAAGAATGTGTTGGTTGTCTCTGTAGTCACACCTGGATGTAACCAGCTTCCTACTTTGGAGTCAGTGGAGAATTATACCCTGACCATAAATGATGACCAGTGTTTACTCCTCTCTGAGACTGTCTGGGGAGCTCTCCGAGGTCTGGAGACTTTTAGCCAGCTTGTTTGGAAATCTGCTGAGGGCACATTCTTTATCAACAAGACTGAGATTGAGGACTTTCCCCGCTTTCCTCACCGGGGCTTGCTGTTGGATACATCTCGCCATTACCTGCCACTCTCTAGCATCCTGGACACTCTGGATGTCATGGCGTACAATAAATTGAACGTGTTCCACTGGCATCTGGTAGATGATCCTTCCTTCCCATATGAGAGCTTCACTTTTCCAGAGCTCATGAGAAAGGGGTCCTACAACCCTGTCACCCACATCTACACAGCACAGGATGTGAAGGAGGTCATTGAATACGCACGGCTCCGGGGTATCCGTGTGCTTGCAGAGTTTGACACTCCTGGCCACACTTTGTCCTGGGGACCAGGTATCCCTGGATTACTGACTCCTTGCTACTCTGGGTCTGAGCCCTCTGGCACCTTTGGACCAGTGAATCCCAGTCTCAATAATACCTATGAGTTCATGAGCACATTCTTCTTAGAAGTCAGCTCTGTCTTCCCAGATTTTTATCTTCATCTTGGAGGAGATGAGGTTGATTTCACCTGCTGGAAGTCCAACCCAGAGATCCAGGACTTTATGAGGAAGAAAGGCTTCGAGGACTTCAAGCAGCTGGAGTCCTTCTACATCCAGACGCTGCTGGACATCGTCTCTTCTTATGGCAAGGGCTATGTGGTGTGGCAGGAGGTGTTTGATAATAAAGTAAAGATTCAGCCAGACACAATCATACAGGTGTGGCGAGAGGATATTCCAGTGAACTATATGAAGGAGCTGGAACTGGTCACCAAGGCCGGCTTCCGGGCCCTTCTCTCTGCCCCCTGGTACCTGAACCGTATATCCTATGGCCCTGACTGGAAGGATTTCTACATAGTGGAACCCCTGGCATTTGAAGGTACCCCTGAGCAGAAGGCTCTGGTGATTGGTGGAGAGGCTTGTATGTGGGGAGAATATGTGGACAACACAAACCTGGTCCCCAGGCTCTGGCCCAGAGCAGGGGCTGTTGCCGAAAGGCTGTGGAGCAACAAGTTGACATCTGACCTGACATTTGCCTATGAACGTTTGTCACACTTCCGCTGTGAATTGCTGAGGCGAGGTGTCCAGGCCCAACCCCTCAATGTAGGCTTCTGTGAGCAGGAGTTTGAACAGACC
    ATGGCCGGCTGCAGGCTCTGGGTTTCGCTGCTGCTGGCGGCGGCGTTGGCTTGCTTGGCCACGGCACTGTGGCCGTGGCCCCAGTACATCCAAACCTACCACCGGCGCTACACCCTGTACCCCAACAACTTCCAGTTCCGGTACCATGTCAGTTCGGCCGCGCAGGCGGGCTGCGTCGTCCTCGACGAGGCCTTTCGACGCTACCGTAACCTGCTCTTCGGTTCCGGCTCTTGGCCCCGACCCAGCTTCTCAAATAAACAGCAAACGTTGGGGAAGAACATTCTGGTGGTCTCCGTCGTCACAGCTGAATGTAATGAATTTCCTAATTTGGAGTCGGTAGAAAATTACACCCTAACCATTAATGATGACCAGTGTTTACTCGCCTCTGAGACTGTCTGGGGCGCTCTCCGAGGTCTGGAGACTTTCAGTCAGCTTGTTTGGAAATCAGCTGAGGGCACGTTCTTTATCAACAAGACAAAGATTAAAGACTTTCCTCGATTCCCTCACCGGGGCGTACTGCTGGATACATCTCGCCATTACCTGCCATTGTCTAGCATCCTGGATACACTGGATGTCATGGCATACAATAAATTCAACGTGTTCCACTGGCACTTGGTGGACGACTCTTCCTTCCCATATGAGAGCTTCACTTTCCCAGAGCTCACCAGAAAGGGGTCCTTCAACCCTGTCACTCACATCTACACAGCACAGGATGTGAAGGAGGTCATTGAATACGCAAGGCTTCGGGGTATCCGTGTGCTGGCAGAATTTGACACTCCTGGCCACACTTTGTCCTGGGGGCCAGGTGCCCCTGGGTTATTAACACCTTGCTACTCTGGGTCTCATCTCTCTGGCACATTTGGACCGGTGAACCCCAGTCTCAACAGCACCTATGACTTCATGAGCACACTCTTCCTGGAGATCAGCTCAGTCTTCCCGGACTTTTATCTCCACCTGGGAGGGGATGAAGTCGACTTCACCTGCTGGAAGTCCAACCCCAACATCCAGGCCTTCATGAAGAAAAAGGGCTTTACTGACTTCAAGCAGCTGGAGTCCTTCTACATCCAGACGCTGCTGGACATCGTCTCTGATTATGACAAGGGCTATGTGGTGTGGCAGGAGGTATTTGATAATAAAGTGAAGGTTCGGCCAGATACAATCATACAGGTGTGGCGGGAAGAAATGCCAGTAGAGTACATGTTGGAGATGCAAGATATCACCAGGGCTGGCTTCCGGGCCCTGCTGTCTGCTCCCTGGTACCTGAACCGTGTAAAGTATGGCCCTGACTGGAAGGACATGTACAAAGTGGAGCCCCTGGCGTTTCATGGTACGCCTGAACAGAAGGCTCTGGTCATTGGAGGGGAGGCCTGTATGTGGGGAGAGTATGTGGACAGCACCAACCTGGTCCCCAGACTCTGGCCCAGAGCGGGTGCCGTCGCTGAGAGACTGTGGAGCAGTAACCTGACAACTAATATAGACTTTGCCTTTAAACGTTTGTCGCATTTCCGTTGTGAGCTGGTGAGGAGAGGAATCCAGGCCCAGCCCATCAGTGTAGGCTGCTGTGAGCAGGAGTTTGAGCAGACT
    Translations:
    M  T  S  S  R  L  W  F  S  L  L  L  A  A  A  F  A  G  R  A  T  A  L  W  P  W  P  Q  N  F  Q  T  S  D  Q  R  Y  V  L  Y  P  N  N  F  Q  F  Q  Y  D  V  S  S  A  A  Q  P  G  C  S  V  L  D  E  A  F  Q  R  Y  R  D  L  L  F  G  S  G  S  W  P  R  P  Y  L  T  G  K  R  H  T  L  E  K  N  V  L  V  V  S  V  V  T  P  G  C  N  Q  L  P  T  L  E  S  V  E  N  Y  T  L  T  I  N  D  D  Q  C  L  L  L  S  E  T  V  W  G  A  L  R  G  L  E  T  F  S  Q  L  V  W  K  S  A  E  G  T  F  F  I  N  K  T  E  I  E  D  F  P  R  F  P  H  R  G  L  L  L  D  T  S  R  H  Y  L  P  L  S  S  I  L  D  T  L  D  V  M  A  Y  N  K  L  N  V  F  H  W  H  L  V  D  D  P  S  F  P  Y  E  S  F  T  F  P  E  L  M  R  K  G  S  Y  N  P  V  T  H  I  Y  T  A  Q  D  V  K  E  V  I  E  Y  A  R  L  R  G  I  R  V  L  A  E  F  D  T  P  G  H  T  L  S  W  G  P  G  I  P  G  L  L  T  P  C  Y  S  G  S  E  P  S  G  T  F  G  P  V  N  P  S  L  N  N  T  Y  E  F  M  S  T  F  F  L  E  V  S  S  V  F  P  D  F  Y  L  H  L  G  G  D  E  V  D  F  T  C  W  K  S  N  P  E  I  Q  D  F  M  R  K  K  G  F  E  D  F  K  Q  L  E  S  F  Y  I  Q  T  L  L  D  I  V  S  S  Y  G  K  G  Y  V  V  W  Q  E  V  F  D  N  K  V  K  I  Q  P  D  T  I  I  Q  V  W  R  E  D  I  P  V  N  Y  M  K  E  L  E  L  V  T  K  A  G  F  R  A  L  L  S  A  P  W  Y  L  N  R  I  S  Y  G  P  D  W  K  D  F  Y  I  V  E  P  L  A  F  E  G  T  P  E  Q  K  A  L  V  I  G  G  E  A  C  M  W  G  E  Y  V  D  N  T  N  L  V  P  R  L  W  P  R  A  G  A  V  A  E  R  L  W  S  N  K  L  T  S  D  L  T  F  A  Y  E  R  L  S  H  F  R  C  E  L  L  R  R  G  V  Q  A  Q  P  L  N  V  G  F  C  E  Q  E  F  E  Q  T  
    M  A  G  C  R  L  W  V  S  L  L  L  A  A  A  L  A  C  L  A  T  A  L  W  P  W  P  Q  Y  I  Q  T  Y  H  R  R  Y  T  L  Y  P  N  N  F  Q  F  R  Y  H  V  S  S  A  A  Q  A  G  C  V  V  L  D  E  A  F  R  R  Y  R  N  L  L  F  G  S  G  S  W  P  R  P  S  F  S  N  K  Q  Q  T  L  G  K  N  I  L  V  V  S  V  V  T  A  E  C  N  E  F  P  N  L  E  S  V  E  N  Y  T  L  T  I  N  D  D  Q  C  L  L  A  S  E  T  V  W  G  A  L  R  G  L  E  T  F  S  Q  L  V  W  K  S  A  E  G  T  F  F  I  N  K  T  K  I  K  D  F  P  R  F  P  H  R  G  V  L  L  D  T  S  R  H  Y  L  P  L  S  S  I  L  D  T  L  D  V  M  A  Y  N  K  F  N  V  F  H  W  H  L  V  D  D  S  S  F  P  Y  E  S  F  T  F  P  E  L  T  R  K  G  S  F  N  P  V  T  H  I  Y  T  A  Q  D  V  K  E  V  I  E  Y  A  R  L  R  G  I  R  V  L  A  E  F  D  T  P  G  H  T  L  S  W  G  P  G  A  P  G  L  L  T  P  C  Y  S  G  S  H  L  S  G  T  F  G  P  V  N  P  S  L  N  S  T  Y  D  F  M  S  T  L  F  L  E  I  S  S  V  F  P  D  F  Y  L  H  L  G  G  D  E  V  D  F  T  C  W  K  S  N  P  N  I  Q  A  F  M  K  K  K  G  F  T  D  F  K  Q  L  E  S  F  Y  I  Q  T  L  L  D  I  V  S  D  Y  D  K  G  Y  V  V  W  Q  E  V  F  D  N  K  V  K  V  R  P  D  T  I  I  Q  V  W  R  E  E  M  P  V  E  Y  M  L  E  M  Q  D  I  T  R  A  G  F  R  A  L  L  S  A  P  W  Y  L  N  R  V  K  Y  G  P  D  W  K  D  M  Y  K  V  E  P  L  A  F  H  G  T  P  E  Q  K  A  L  V  I  G  G  E  A  C  M  W  G  E  Y  V  D  S  T  N  L  V  P  R  L  W  P  R  A  G  A  V  A  E  R  L  W  S  S  N  L  T  T  N  I  D  F  A  F  K  R  L  S  H  F  R  C  E  L  V  R  R  G  I  Q  A  Q  P  I  S  V  G  C  C  E  Q  E  F  E  Q  T  
    
    Dn = 
    0.0933
    
    Ds = 
    0.5181
    
    Vardn = 
    1.1542e-04
    
    Vards = 
    0.0029
    

    Input Arguments

    collapse all

    Nucleotide sequence, specified as a character vector, string, or structure with a Sequence field.

    Nucleotide sequence, specified as a character vector, string, or structure with a Sequence field.

    Name-Value Arguments

    collapse all

    Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

    Example: dnds(SeqNT1,SeqNT2,GeneticCode=3)

    Genetic code, specified as an integer representing a code number or a character vector or string with a code name.

    See GeneticCode for code numbers and code names. If you use a code name, you can truncate it to the first two characters. The default is 1 or Standard.

    Example: dnds(SeqNT1,SeqNT2,GeneticCode=3)

    Method for calculating substitution rates, specified as a character vector or string.

    The available methods are:

    • "NG" (default) — The Nei-Gojobori method [2], which uses the number of synonymous and nonsynonymous substitutions and the number of potentially synonymous and nonsynonymous sites. This method is based on the Jukes-Cantor model.

    • "LWL" — The Li-Wu-Luo method [1], which uses the number of transitional and transversional substitutions at three different levels of degeneracy of the genetic code. This method is based on Kimura's two-parameter model.

    • "PBL" — The Pamilo-Bianchi-Li method [5], which is similar to the Li-Wu-Luo method, but with bias correction. Use this method when the number of transitions is much larger than the number of transversions.

    Example: dnds(SeqNT1,SeqNT2,Method="LWL")

    Sliding window size, in codons, specified as an integer. This is used for calculating substitution rates and variances.

    Example: dnds(SeqNT1,SeqNT2,Window=4)

    Flag to exclude stop codons from calculations, specified as a numeric or logical 1 (true) or 0 (false).

    When set to true:

    • Stop codons are excluded from frequency tables.

    • Paths containing stop codons are not counted in the Nei-Gojobori method.

    Example: dnds(SeqNT1,SeqNT2,AdjustStops=false)

    Flag to display codons considered in the computations and their amino acid translations, specified as a numeric or logical 1 (true) or 0 (false).

    Specify true to use this display to manually verify the codon alignment of the two input sequences. The presence of stop codons (*) in the amino acid translation can indicate that SeqNT1 and SeqNT2 are not codon-aligned.

    Example: dnds(SeqNT1,SeqNT2,Verbose=true)

    Output Arguments

    collapse all

    Nonsynonymous substitution rate, returned as a numeric scalar.

    Synonymous substitution rate, returned as a numeric scalar.

    Variance of the nonsynonymous substitution rate, returned as a numeric scalar.

    Variance of the synonymous substitution rate, returned as a numeric scalar.

    Tips

    References

    [1] Li, W. H., C. I. Wu, and C. C. Luo. “A New Method for Estimating Synonymous and Nonsynonymous Rates of Nucleotide Substitution Considering the Relative Likelihood of Nucleotide and Codon Changes.” Molecular Biology and Evolution 2, no. 2 (March 1985): 150–74. https://doi.org/10.1093/oxfordjournals.molbev.a040343.

    [2] Nei, M., and T. Gojobori. “Simple Methods for Estimating the Numbers of Synonymous and Nonsynonymous Nucleotide Substitutions.” Molecular Biology and Evolution 3, no. 5 (September 1986): 418–26. https://doi.org/10.1093/oxfordjournals.molbev.a040410.

    [3] Nei, M, and L Jin. “Variances of the Average Numbers of Nucleotide Substitutions within and between Populations.” Molecular Biology and Evolution 6, no. 3 (May 1989): 290–300. https://doi.org/10.1093/oxfordjournals.molbev.a040547.

    [4] Nei, Masatoshi, and Sudhir Kumar. “Synonymous and Nonsynonymous Nucleotide Substitutions.” In Molecular Evolution and Phylogenetics. Oxford ; New York: Oxford University Press, 2000.

    [5] Pamilo, P., and N. O. Bianchi. “Evolution of the Zfx and Zfy Genes: Rates and Interdependence between the Genes.” Molecular Biology and Evolution 10, no. 2 (March 1993): 271–81. https://doi.org/10.1093/oxfordjournals.molbev.a040003.

    Version History

    Introduced before R2006a