Creating multiple equally sized matrices from a single numerical cell

I have a very large text file that is, in essence, a single row of numbers. Once I have reorganized the file into a matrix of, for example, 500 x 10, I want to create a new matrix every 10 rows and save each one under its own name. A major problem is that the text file is too big for MATLAB: I get an out-of-memory error. This is why I need to separate each matrix into its own set of data. I have already turned a row of 1049600 numbers into a 1025 x 1024 matrix, but now the file contains 50 of these sets (1049600 x 50 numbers), and I need to create 50 matrices of 1025 x 1024.
fid = fopen('test0001.asc');
Cell = textscan(fid, '%d', 'delimiter', ';');
fclose(fid);
Data = cell2mat(Cell);
N = 1024;
Finish = reshape(Data, N, [])';
The above is the code I had for the smaller files.
I considered organizing the data into 51250 rows of 1024 and then creating a while ~feof loop, but this seems like it would require too much code and would thus be too slow. My thought was to have, say:
F1 = Data(1:1025, :);
F2 = Data(1026:2050, :);
.....
Any thoughts at all would be much appreciated.

Accepted Answer

Stephen23 on 7 Feb 2017
Edited: Stephen23 on 10 Feb 2017
Firstly, the idea of generating lots of variables is popular with beginners, but it really should be avoided.
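As a rough illustration of one alternative to separate variables such as F1, F2, and so on, the blocks can be kept in a single cell array and accessed by index. The sketch below assumes the stacked data already fits in memory and uses small stand-in sizes; the names R, C, nSets and blocks are illustrative, not taken from the question.

```matlab
R = 3;      % rows per block (stand-in for 1025)
C = 4;      % columns per block (stand-in for 1024)
nSets = 5;  % number of blocks (stand-in for 50)

% Fake stacked data: (R*nSets)-by-C, one block under the other.
Data = reshape(1:R*C*nSets, C, []).';

% Split into an nSets-by-1 cell array of R-by-C matrices.
blocks = mat2cell(Data, repmat(R, 1, nSets), C);

% Access a block by index instead of by variable name.
second = blocks{2};  % rows R+1 to 2*R of Data
```

Looping over blocks{k} then replaces any need to generate variable names dynamically.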
Also note that the MATLAB documentation is really good. It is readable and has articles on lots of topics, including one that gives a good, robust method for reading a large file into MATLAB.
The core idea of that code is to call textscan in a loop, use textscan's N option to specify how much data to read, and save the data into a cell array. The N option simply defines how many times the format is applied when reading the file.
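A minimal sketch of that loop, assuming a small hypothetical demo file of twelve semicolon-terminated integers. The names fname, chunkSize and chunks are made up for illustration, and the chunk size of 4 is arbitrary.

```matlab
% Write a small demo file: "1;2;...;12;" on a single line.
fname = [tempname, '.txt'];
fid = fopen(fname, 'wt');
fprintf(fid, '%d;', 1:12);
fclose(fid);

chunkSize = 4;  % N: how many times the format '%d' is applied per call
chunks = {};    % each cell will hold the values from one call

fid = fopen(fname, 'rt');
while ~feof(fid)
    tmp = textscan(fid, '%d', chunkSize, 'EndOfLine', ';');
    if isempty(tmp{1})
        break   % nothing left to read
    end
    chunks{end+1} = tmp{1}; %#ok<AGROW>
end
fclose(fid);
delete(fname);

allValues = vertcat(chunks{:});  % all values read, in file order
```

Each call to textscan resumes where the previous one stopped, so only chunkSize values are read per call.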
You should be able to work it out from the examples in the documentation.
As an alternative, you might like to read about tall arrays, which are a special kind of data type for working with very large data sets that do not fit into memory.
EDIT 2017-02-10: add code from comment:
%% Create fake data file
% fid = fopen('temp2.txt','wt');
% for k = 1:50
%     fprintf(fid,'%d;',randi([0,255],1,1025*1024));
% end
% fclose(fid);
%% Read data file
R = 1025; % rows per matrix
C = 1024; % columns per matrix
opt = {'EndOfLine',';', 'CollectOutput',true};
fid = fopen('temp2.txt','rt');
k = 0;
while ~feof(fid)
    % read at most R*C values, i.e. one matrix per iteration
    Z = textscan(fid,'%d', R*C, opt{:});
    if ~isempty(Z{1})
        k = k+1;
        S = sprintf('temp2_%02d.txt',k); % output name: temp2_01.txt, temp2_02.txt, ...
        dlmwrite(S,reshape(Z{1},[],R).',';') % might need to transpose
    end
end
fclose(fid);
Comments: 12
Aaron Smith on 9 Feb 2017
I have the code working fairly well. There is just one thing I'm not too sure about: what does the opt = {'EndOfLine', ';'}; line in your code do? What is its purpose? Thanks again, Stephen.
Stephen23 on 9 Feb 2017
Edited: Stephen23 on 9 Feb 2017
@Aaron Smith: take a look at these two lines:
opt = {'EndOfLine',';'};
...
Z = textscan(fid,'%d', R*C, opt{:});
The first defines the cell array opt; the second passes the elements of opt as separate input arguments to textscan (opt{:} expands to a comma-separated list). So it is simply a convenient way to supply the inputs without writing them all on one line like this:
Z = textscan(fid,'%d', R*C, 'EndOfLine',';');
For just two arguments it does not make much difference, but sometimes there can be quite a few arguments, and I find the cell array keeps things tidy. It is just a personal choice to do it like that; there is no deeper meaning. You can write the inputs on one line if you wish.
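To make the equivalence concrete, here is a small sketch that parses a short, made-up character vector both ways and compares the results. The input text '1;2;3' and the variable names A and B are illustrative only.

```matlab
opt = {'Delimiter', ';', 'CollectOutput', true};

% Options supplied through the cell array ...
A = textscan('1;2;3', '%d', opt{:});

% ... and the same options written out in full.
B = textscan('1;2;3', '%d', 'Delimiter', ';', 'CollectOutput', true);

same = isequal(A, B);  % both calls receive identical arguments
```

The same opt{:} syntax works with any function that accepts a list of arguments.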


More Answers (1)

Guillaume on 8 Feb 2017
Edited: Guillaume on 8 Feb 2017
Since R2014b, MATLAB has had tools for reading, in chunks, files that are too big to fit in memory. Why not use these? See datastore and, in your particular case, tabularTextDatastore.
Since R2016b, this has become even easier with the introduction of tall arrays.
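A hedged sketch of the datastore approach, assuming a hypothetical comma-separated demo file with one record per line. That layout is an assumption made for the example; a file consisting of one very long semicolon-separated line, as in the question, may not map onto it directly. The names fname, ds and nRows are illustrative.

```matlab
% Write a small demo file: 10 rows of 3 comma-separated integers.
fname = [tempname, '.csv'];
fid = fopen(fname, 'wt');
fprintf(fid, '%d,%d,%d\n', reshape(1:30, 3, []));
fclose(fid);

% Create the datastore; the demo file has no header line.
ds = tabularTextDatastore(fname, 'ReadVariableNames', false);
ds.ReadSize = 4;  % maximum number of rows returned by each read

nRows = 0;
while hasdata(ds)
    T = read(ds);              % table holding the next chunk of rows
    nRows = nRows + height(T); % process the chunk here
end
delete(fname);
```

Only one chunk is held in memory at a time, which is the point of using a datastore for files that are too large to load whole.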
