How can I compute the amino acid composition of my protein sequence data?

Nedz on 23 Apr 2020
Answered: Tim DeFreitas on 23 Apr 2020
How can I compute the amino acid composition of my protein sequences so that I can use it to train my SVM classifier?
For example, suppose one of my sample sequences is:
'AEYDDSLIDEEEDDEDLDEFKPIVQYDNFQDEENIGIYKELEDLIEKNE'

Accepted Answer

Tommy on 23 Apr 2020
Edited: Tommy on 23 Apr 2020
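% The 20 standard amino acids, sorted so the letters can serve as histc bin edges.
allAA = sort('ARNDCQEGHILKMFPSTWYV');
seq = 'AEYDDSLIDEEEDDEDLDEFKPIVQYDNFQDEENIGIYKELEDLIEKNE';
% Count the occurrences of each residue.
% Note: histc bins by range, so a non-standard letter such as B or X
% would be counted under the preceding standard letter.
counts = histc(seq, allAA);
% Composition as a fraction of the sequence length.
freq = counts/numel(seq);
for aa = allAA
    % Scale the fraction by 100 so the printed value really is a percentage.
    fprintf('%c: %d/%d (%.2f%%)\n', aa, counts(allAA==aa), numel(seq), 100*freq(allAA==aa));
end
%{
prints:
A: 1/49 (2.04%)
C: 0/49 (0.00%)
D: 10/49 (20.41%)
E: 12/49 (24.49%)
F: 2/49 (4.08%)
G: 1/49 (2.04%)
H: 0/49 (0.00%)
I: 5/49 (10.20%)
K: 3/49 (6.12%)
L: 4/49 (8.16%)
M: 0/49 (0.00%)
N: 3/49 (6.12%)
P: 1/49 (2.04%)
Q: 2/49 (4.08%)
R: 0/49 (0.00%)
S: 1/49 (2.04%)
T: 0/49 (0.00%)
V: 1/49 (2.04%)
W: 0/49 (0.00%)
Y: 3/49 (6.12%)
%}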
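
Since the goal is to train an SVM, the same idea can be extended to several sequences at once. The following is only a rough sketch, not part of the original answer: the cell array seqs, the matrix name X, and the second sequence are made up for illustration. It builds one row of 20 composition fractions per sequence by comparing each sequence against each standard residue letter.

```matlab
% Hypothetical input: the first sequence is the one from the question,
% the second is invented purely for illustration.
seqs = {'AEYDDSLIDEEEDDEDLDEFKPIVQYDNFQDEENIGIYKELEDLIEKNE', ...
        'MKTAYIAKQR'};

% The 20 standard residues in alphabetical order (column order of X).
allAA = 'ACDEFGHIKLMNPQRSTVWY';

% One row per sequence, one column per residue.
X = zeros(numel(seqs), numel(allAA));
for i = 1:numel(seqs)
    s = upper(seqs{i});                          % tolerate lower-case input
    for j = 1:numel(allAA)
        % Fraction of positions in s occupied by residue allAA(j).
        X(i, j) = sum(s == allAA(j)) / numel(s);
    end
end
```

Each row of X sums to 1 as long as the sequence contains only standard residues; non-standard letters are simply not counted here. X, together with a vector of class labels, could then be passed to an SVM trainer such as fitcsvm (Statistics and Machine Learning Toolbox).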

More Answers (1)

Tim DeFreitas on 23 Apr 2020
If you have the Bioinformatics Toolbox, there is also the aacount function: https://www.mathworks.com/help/bioinfo/ref/aacount.html
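seq = 'AEYDDSLIDEEEDDEDLDEFKPIVQYDNFQDEENIGIYKELEDLIEKNE';
% Count each amino acid; the result is displayed because the line has no semicolon.
counts = aacount(seq)
% Optional: aacount can also plot the counts, here as a bar chart.
aacount(seq, 'chart', 'bar')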
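
To use the aacount result as an SVM feature vector, the counts have to be pulled out into a numeric array. The snippet below is a hedged sketch rather than part of the original answer: it assumes, as the aacount documentation describes, that the function returns a 1-by-1 structure with one numeric field per standard amino acid, and the variable names aaStruct, letters, and freq are illustrative.

```matlab
seq = 'AEYDDSLIDEEEDDEDLDEFKPIVQYDNFQDEENIGIYKELEDLIEKNE';

% Requires the Bioinformatics Toolbox.
aaStruct = aacount(seq);

% Residue letters, in whatever field order the structure uses.
letters = fieldnames(aaStruct).';

% Row vector of counts, in the same order as letters.
counts = cell2mat(struct2cell(aaStruct)).';

% Composition as a fraction of the sequence length.
freq = counts / numel(seq);
```

Reading the letters from fieldnames instead of hard-coding them keeps the counts and their labels aligned regardless of the field order.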
