Align two profiles using Needleman-Wunsch global alignment
Prof
= profalign(Prof1
, Prof2
)
[Prof, H1, H2
] = profalign(Prof1
, Prof2
)
profalign(..., 'ScoringMatrix', ScoringMatrixValue
,
...)
profalign(..., 'GapOpen', {G1Value
, G2Value
},
...)
profalign(..., 'ExtendGap', {E1Value
, E2Value
},
...)
profalign(..., 'ExistingGapAdjust', ExistingGapAdjustValue
,
...)
profalign(..., 'TerminalGapAdjust', TerminalGapAdjustValue
,
...)
profalign(..., 'ShowScore', ShowScoreValue
,
...)
returns
a new profile (Prof
= profalign(Prof1
, Prof2
)Prof
) for the optimal global
alignment of two profiles (Prof1
, Prof2
).
The profiles (Prof1
, Prof2
)
are numeric arrays of size [(4 or 5 or 20 or 21) x Profile
Length]
with counts or weighted profiles. Weighted profiles
are used to down-weight similar sequences and up-weight divergent
sequences. The output profile is a numeric matrix of size [(5
or 21) x New Profile Length]
where the last row represents
gaps. Original gaps in the input profiles are preserved. The output
profile is the result of adding the aligned columns of the input profiles.
[
returns
pointers that indicate how to rearrange the columns of the original
profiles into the new profile.Prof, H1, H2
] = profalign(Prof1
, Prof2
)
profalign(..., '
calls PropertyName
', PropertyValue
,
...)profalign
with optional properties
that use property name/property value pairs. You can specify one or
more properties in any order. Each PropertyName
must
be enclosed in single quotation marks and is case insensitive. These
property name/property value pairs are as follows:
defines the scoring matrix to be used for the alignment.
profalign(..., 'ScoringMatrix', ScoringMatrixValue
,
...)
ScoringMatrixValue
can be either
of the following:
String specifying the scoring matrix to use for the alignment. Choices for amino acid sequences are:
'BLOSUM62'
'BLOSUM30'
increasing by 5
up
to 'BLOSUM90'
'BLOSUM100'
'PAM10'
increasing by 10
up
to 'PAM500'
'DAYHOFF'
'GONNET'
Default is:
'BLOSUM50'
— When AlphabetValue
equals 'AA'
'NUC44'
— When AlphabetValue
equals 'NT'
Note:
The above scoring matrices, provided with the software, also
include a structure containing a scale factor that converts the units
of the output score to bits. You can also use the |
Matrix representing the scoring matrix to use for
the alignment, such as returned by the blosum
, pam
, dayhoff
, gonnet
,
or nuc44
function.
Note: If you use a scoring matrix that you created or was created by one of the above functions, the matrix does not include a scale factor. The output score will be returned in the same units as the scoring matrix. |
Note:
If you need to compile |
profalign(..., 'GapOpen', {
sets the penalties for opening a gap in the first
and second profiles respectively. G1Value
, G2Value
},
...)G1Value
and G2Value
can
be either scalars or vectors. When using a vector, the number of elements
is one more than the length of the input profile. Every element indicates
the position specific penalty for opening a gap between two consecutive
symbols in the sequence. The first and the last elements are the gap
penalties used at the ends of the sequence. The default gap open penalties
are {10,10}
.
profalign(..., 'ExtendGap', {
sets the penalties for extending a gap in the first
and second profile respectively. E1Value
, E2Value
},
...)E1Value
and E2Value
can
be either scalars or vectors. When using a vector, the number of elements
is one more than the length of the input profile. Every element indicates
the position specific penalty for extending a gap between two consecutive
symbols in the sequence. The first and the last elements are the gap
penalties used at the ends of the sequence. If ExtendGap
is
not specified, then extensions to gaps are scored with the same value
as GapOpen
.
profalign(..., 'ExistingGapAdjust',
, if ExistingGapAdjustValue
,
...)ExistingGapAdjustValue
is false
,
turns off the automatic adjustment based on existing gaps of the position-specific
penalties for opening a gap. When ExistingGapAdjustValue
is true
(default),
for every profile position, profalign
proportionally
lowers the penalty for opening a gap toward the penalty of extending
a gap based on the proportion of gaps found in the contiguous symbols
and on the weight of the input profile.
profalign(..., 'TerminalGapAdjust',
, when TerminalGapAdjustValue
,
...)TerminalGapAdjustValue
is true
,
adjusts the penalty for opening a gap at the ends of the sequence
to be equal to the penalty for extending a gap. Default is false
.
profalign(..., 'ShowScore',
, when ShowScoreValue
,
...)ShowScoreValue
is true
,
displays the scoring space and the winning path.
Read in sequences and create profiles.
ma1 = ['RGTANCDMQDA';'RGTAHCDMQDA';'RRRAPCDL-DA']; ma2 = ['RGTHCDLADAT';'RGTACDMADAA']; p1 = seqprofile(ma1,'gaps','all','counts',true); p2 = seqprofile(ma2,'counts',true);
Merge two profiles into a single one by aligning them.
p = profalign(p1,p2); seqlogo(p)
Use the output pointers to generate the multiple alignment.
[p, h1, h2] = profalign(p1,p2);
ma = repmat('-',5,12);
ma(1:3,h1) = ma1;
ma(4:5,h2) = ma2;
disp(ma)
Increase the gap penalty before cysteine in the second profile.
gapVec = 10 + [p2(aa2int('C'),:) 0] * 10 p3 = profalign(p1,p2,'gapopen',{10,gapVec}); seqlogo(p3)
Add a new sequence to a profile without inserting new gaps into the profile.
gapVec = [0 inf(1,11) 0]; p4 = profalign(p3,seqprofile('PLHFMSVLWDVQQWP'),... 'gapopen',{gapVec,10}); seqlogo(p4)