Acquiring a very large nucleotide sequence in the MATLAB workspace

Question

Shiwani on 6 May 2012

https://kr.mathworks.com/matlabcentral/answers/37550-acquiring-very-large-nucleotide-sequence-in-matlab-workspace

I have been running into the following problems while trying to import very large sequences (on the order of megabases) into MATLAB (version R2009a):

1. getgenbank does not load the sequence information and CDS headers into MATLAB, so I am unable to access some sequences.
2. If I use seqtool on a very large sequence, e.g. 'NC_00091', MATLAB stops responding.
3. If I do manage to get a very large sequence into the workspace for some organisms using getgenbank or seqtool, MATLAB exits automatically when I try to run my code. Sometimes it shows an error saying the data is too large.
4. Do I need to use a BioIndexedFile for this purpose? Or am I missing something on the memory-management side?
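On the memory-management side of item 4, one option is to avoid holding the whole sequence in the workspace at once. The following is only a minimal sketch, assuming the sequence has already been saved to disk as a plain-text file of bases. The temporary file, the block size and the base counting are hypothetical stand-ins for a real file and real analysis code; only base MATLAB file I/O is used.

```matlab
% Hypothetical stand-in for a real sequence file: 40,000 bases on disk.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fwrite(fid, repmat('ACGTTGCA', 1, 5000));   % default precision is uint8
fclose(fid);

blockSize = 4096;          % bases per read (assumed value; tune as needed)
counts = zeros(1, 4);      % running totals for A, C, G, T

fid = fopen(fname, 'r');
while true
    % '*uint8' keeps each base as 1 byte instead of a 2-byte char
    block = fread(fid, blockSize, '*uint8');
    if isempty(block)
        break;             % end of file reached
    end
    counts = counts + [sum(block == uint8('A')), sum(block == uint8('C')), ...
                       sum(block == uint8('G')), sum(block == uint8('T'))];
end
fclose(fid);
delete(fname);             % remove the temporary demo file

disp(counts)
```

Only one block is in memory at a time, so peak usage is set by blockSize rather than by the length of the sequence.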

Comments: 2

the cyclist on 6 May 2012

Can you give a sense of how large "very large" is? How many values do you need to store, and what type are they (e.g. small integers, double precision, etc.)?

Shiwani on 6 May 2012

The sequence is a 1x4639675 array of nucleotide bases and requires 9279350 bytes of memory. The data is of type char (nucleotide sequences AGCT...).
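The figure quoted above follows from MATLAB storing each char in 2 bytes: 4639675 x 2 = 9279350 bytes, or roughly 8.85 MiB. As a rough sketch (the short demo sequence and the variable names are illustrative, not taken from the original post), the same arithmetic can be checked with whos, and casting to uint8 halves the footprint without losing information:

```matlab
% Size reported in the comment above
nBases = 4639675;
bytesAsChar = 2 * nBases;          % MATLAB char uses 2 bytes per element
mibAsChar = bytesAsChar / 2^20;    % roughly 8.85 MiB

% Small illustrative sequence (not the real genome)
seq = repmat('ACGT', 1, 1000);     % 4000 bases as char
seq8 = uint8(seq);                 % same bases, 1 byte each

c = whos('seq');
u = whos('seq8');
fprintf('char: %d bytes, uint8: %d bytes\n', c.bytes, u.bytes);

% The conversion is reversible
roundTripOk = isequal(char(seq8), seq);
```

If the sequence itself really is this size, it may be worth checking how much memory the accompanying annotation data (for example the CDS features) takes up as well.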
