Sequence Distance

Talha on 20 Jul 2011
I am somewhat confused about how MATLAB arrives at its answers for the various distance methods, and my boss wants to know how it gets them.
I set up MATLAB to display answers as fractions, so when I analyze two sequences of the same length, the denominator of the result is the sequence length (for example, if both amino acid sequences have a length of 327, the answer has a denominator of 327). That made sense until I analyzed two amino acid sequences of different lengths, one 369 amino acids long and the other 379 amino acids long. The answer was 209/398. I don't understand where the denominator of 398 comes from (I specifically asked for the p-distance). Typing "help seqpdist" does not give a very clear explanation of how the p-distance works.
So can someone please help me out? I would greatly appreciate it!

Answers (1)

Lucio Cetto on 20 Jul 2011
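When you are comparing sequences it is common to first align them using a dynamic programming algorithm. SEQPDIST uses NWALIGN to pairwise align all possible pairs of sequences and then takes the measure from the alignment. The denominator is therefore the length of the alignment, gaps included, which is why it can be larger than either of the original sequences.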
Consider:
seqpdist({'AACGT','AAGT','AAT'},'alpha','nt','square',1,'method','p-dist')
The alignment between 1 and 2 is 'AACGT' and 'AA-GT' => 1/5
The alignment between 1 and 3 is 'AACGT' and 'AA--T' => 2/5
The alignment between 2 and 3 is 'AAGT' and 'AA-T' => 1/4
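To make the arithmetic explicit, here is a minimal sketch in base MATLAB that recomputes the three fractions from the aligned pairs written above. The helper name pdistAligned is made up for this illustration and is not a toolbox function. It assumes the two rows are already aligned to the same length and that a gap column counts as a difference, which is what the fractions above imply.

```matlab
% Hypothetical helper (not part of the Bioinformatics Toolbox):
% p-distance of one already-aligned pair of equal-length char vectors.
% Assumption: a column containing a gap ('-') counts as a difference.
pdistAligned = @(a, b) sum(a ~= b) / numel(a);

% The three aligned pairs from the example above
d12 = pdistAligned('AACGT', 'AA-GT');   % 1 differing column out of 5
d13 = pdistAligned('AACGT', 'AA--T');   % 2 differing columns out of 5
d23 = pdistAligned('AAGT',  'AA-T');    % 1 differing column out of 4

fprintf('d12 = %g, d13 = %g, d23 = %g\n', d12, d13, d23);
```

The denominator is numel(a), the number of columns in the alignment, not the length of either ungapped sequence. For your own pair you can look at the alignment itself through the second output of NWALIGN and count its columns. Whether that matches what SEQPDIST does internally depends on both functions being given the same scoring matrix and gap settings.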
HTH
