Concatenate data from processing of imported text files
I am reading text files from a folder on my computer, using a for loop to process each file. Each file undergoes simple processing: here I extract its first two columns and store them in K9. K9 is typically 14000×2, though the row count varies slightly (for example 14002×2). How can I concatenate all the K9 values into a new matrix, say K10? My text files are named A1.txt, A2.txt, A3.txt, ... up to A22.txt. I also need a new first row that stores the names of these text files, so that I can tell which file each block of data belongs to. I have searched the MATLAB forum for help but could not solve the problem. Here is my code:
directory = 'D:\PhD\Matlab code'; % path from which I am extracting files
textfiles = fullfile(directory,'*.txt');
dinfo = dir(textfiles);
for K = 1:length(dinfo)
incidentfile = fullfile(directory, dinfo(K).name);
K0 = importdata(incidentfile);
K1 = K0.data;%I extract only the matrix data useful to me not the rest
K9= [K1(:,1) K1(:,2)];%extract the first two columns not the third
end
Please help. Thank you.
Regards
Accepted Answer
Max Murphy
9 December 2019
Not the most efficient, but you could do:
directory = 'D:\PhD\Matlab code'; % path from which I am extracting files
textfiles = fullfile(directory,'*.txt');
dinfo = dir(textfiles);
K9 = [];
K10 = [];
for K = 1:length(dinfo)
incidentfile = fullfile(directory, dinfo(K).name);
K0 = importdata(incidentfile);
K1 = K0.data;%I extract only the matrix data useful to me not the rest
% Same as vertcat() function:
K9= [K9; K1(:,1) K1(:,2)];%extract the first two columns not the third
% Cell array of labels:
K10 = [K10; repmat({dinfo(K).name},size(K1,1),1)];
end
You might also look into the MATLAB table format, since it seems you want different data entries where each row is a labeled data point.
To make the labeling vector smaller, you may also look into MATLAB categorical variables, which can reduce the memory used if you first collect all the unique entries of dinfo.name as the categories.
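As a rough illustration of the table and categorical suggestion, here is a minimal sketch. It does not read any files: `names` and `blocks` are hypothetical stand-ins for the file names from `dinfo` and the per-file K9 matrices, and the column names `File`, `X`, and `Y` are assumptions chosen for the example.

```matlab
% Hypothetical stand-ins for dinfo(K).name and the per-file K9 matrices
names  = {'A1.txt', 'A2.txt'};
blocks = {[1 2; 3 4; 5 6], [7 8; 9 10]};

% Stack all two-column blocks vertically in one step
xy = vertcat(blocks{:});

% Build one file index per row, then turn it into a categorical label
rowsPerFile = cellfun(@(b) size(b, 1), blocks);
fileIdx     = repelem(1:numel(names), rowsPerFile).';
File        = categorical(fileIdx, 1:numel(names), names);

% One labeled row per data point
T = table(File, xy(:, 1), xy(:, 2), 'VariableNames', {'File', 'X', 'Y'});

% Rows belonging to a single file can then be selected by label
A2rows = T(T.File == 'A2.txt', :);
```

With this layout the file name travels with every row, so the blocks do not need equal row counts.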
6 Comments
Max Murphy
10 December 2019
I see that they are intermediate steps in a processing algorithm, so it might make sense to do it that way.
Your algorithm is currently something like:
% Iterate on each file in the dataset
for K = 1:length(dinfo)
% Import the data
...
% Do processing on reduced subset of the data that meets criteria
...
% Write the result of processing to variable K7
% --> K7 is overwritten each time the loop runs
end
It might be easier to write a separate processing function that is called once on each loop iteration:
% Data matrix we wish to concatenate
data = [];
for K = 1:length(dinfo)
% Import the data
incidentfile = fullfile(directory, dinfo(K).name);
K0 = importdata(incidentfile);
tmp = [K0.data(:,1),K0.data(:,2)];
% Do processing on reduced subset of the data that meets criteria
data = [data; doProcessing(tmp,T,P,Q)];
% Note that this causes the output of doProcessing to be
% vertically concatenated to the existing matrix, [data].
% This can become inefficient for large datasets, in which case
% it is better to pre-allocate your data matrix or store it in
% some other way where only the relevant chunk is being accessed.
end
And the processing function is
function data_out = doProcessing(data_in,T,P,Q)
K2=data_in(data_in(:,1)<=0.61,:);%Operation1
K3=K2(K2(:,1)>0,:);%Operation2
K4=[K3(:,1) K3(:,2)*T];%Operation3
K5=[K4(:,1) K4(:,2)-K4(1,2)];%Operation4
K6=[K5(:,1)*P K5(:,2)*-1];%Operation5
data_out=[K6(:,1) K6(:,2)*Q];%Operation6
end
This can be saved as a .m file in the same working directory as your current script, or it can be a local or nested function within your current function file, for example. If you save it as a separate file, the file name should match the function name (in this example, doProcessing.m).
I would also point out that unless T, P, and Q are scalars, this may not work, depending on the dimensions of your dataset.
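For the inefficiency noted in the loop comments, one common alternative to growing `data` on every iteration is to collect each result in a pre-allocated cell array and concatenate once at the end. The sketch below assumes three files and uses a placeholder matrix where the call to doProcessing would go; `nFiles` and `parts` are names introduced only for this example.

```matlab
nFiles = 3;                 % assumed here; would be length(dinfo) in the real script
parts  = cell(nFiles, 1);   % pre-allocate one slot per file

for K = 1:nFiles
    % Placeholder for doProcessing(tmp,T,P,Q): a (K+1)-by-2 matrix,
    % so each "file" contributes a different number of rows
    parts{K} = K * ones(K + 1, 2);
end

% Single vertical concatenation after the loop
data = vertcat(parts{:});
```

Because each cell holds its own matrix, the row counts can differ between files and no block is copied more than once.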
More Answers (1)
Jakob B. Nielsen
9 December 2019
You can't join two arrays with different numbers of rows side by side, but you can use a structure to store the data, which gives you almost the same thing.
For example
directory = 'D:\PhD\Matlab code'; % path from which I am extracting files
textfiles = fullfile(directory,'*.txt');
dinfo = dir(textfiles);
for K = 1:length(dinfo)
incidentfile = fullfile(directory, dinfo(K).name);
K0 = importdata(incidentfile);
K1 = K0.data;%I extract only the matrix data useful to me not the rest
K9= [K1(:,1) K1(:,2)];%extract the first two columns not the third
concstruct(K).data=K9;
concstruct(K).name=dinfo(K).name;
end
If you absolutely must join the K9 matrices in the same matrix, you will have to either cut everything beyond 14000 rows or pad each shorter matrix with trailing zeros up to the row count of the largest one. You can't have numbers and characters in the same array, so consider calling array2table on your final array and using the file names as the table's variable names. But I would still just use the structure approach; it gives you essentially the same thing :)
2 Comments
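A minimal sketch of the padding route, under a few assumptions: `names` and `blocks` are hypothetical stand-ins for the file names and K9 matrices, the padding uses NaN rather than zeros so that padded cells are not mistaken for data (a substitution, not what the answer states), and the `_x` / `_y` suffixes are invented to keep the variable names unique.

```matlab
% Hypothetical stand-ins for dinfo(K).name and the per-file K9 matrices
names  = {'A1.txt', 'A2.txt'};
blocks = {[1 2; 3 4; 5 6], [7 8; 9 10]};

% Pad every block to the row count of the longest one
nMax     = max(cellfun(@(b) size(b, 1), blocks));
padded   = nan(nMax, 2 * numel(blocks));
varNames = cell(1, 2 * numel(blocks));

for K = 1:numel(blocks)
    n    = size(blocks{K}, 1);
    cols = 2*K - 1 : 2*K;                 % two columns per file
    padded(1:n, cols) = blocks{K};
    [~, base] = fileparts(names{K});      % drop the .txt extension
    varNames(cols) = {[base '_x'], [base '_y']};
end

% File names become the column headers of the table
T = array2table(padded, 'VariableNames', varNames);
```

Swapping `nan` for `zeros` reproduces the zero padding described above.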
Stephen23
9 December 2019
There is no need to create a new structure; you can just use the structure returned by dir:
dinfo(K).data = K9;
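To show how the stored data could be retrieved afterwards, here is a small sketch. The `dinfo` struct is built by hand so the example runs without any files on disk; in the real script it would come from dir, and `blocks` is a hypothetical stand-in for the K9 matrices.

```matlab
% Hand-built stand-in for the struct array returned by dir
dinfo  = struct('name', {'A1.txt', 'A2.txt'});
blocks = {[1 2; 3 4; 5 6], [7 8; 9 10]};

% Attach each file's matrix to its own struct element
for K = 1:numel(dinfo)
    dinfo(K).data = blocks{K};
end

% Look up one file's data by name
idx = strcmp({dinfo.name}, 'A2.txt');
A2  = dinfo(idx).data;
```

The name and the data stay in the same struct element, so no separate label row is needed.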