Multi-batch csv processing error

Question

0 개 추천

trial.zip

I have a series of csv files where each file containts data that I want to extract located at data(1, [1 3]) (please see attached file). The data will then be exported to a one csv file named test.csv with there conrresonsding label (filename) at the first column, as seen below.

What I got so far is this code.

clear all; clc
path = '/Applications/trial';
list = dir(fullfile(path, '*.csv'));
for i = 1:numel(list)
    F = fullfile(path, list(i).name);
    data = readtable (F);
    T = data(1, [1 3]);
    energy = table2array(T);
    dlmwrite('test.csv',energy,'-append')
    
end

However, when I tried exporting, it produces a series of number that is contained in each cell.

댓글 수: 0
이전 댓글 -2개 표시 이전 댓글 -2개 숨기기

댓글을 달려면 로그인하십시오.

이 질문에 답변하려면 로그인하십시오.

활동을 팔로우하려면 로그인

Answer 1

Jorg Woehl 2021년 3월 16일

편집: Jorg Woehl 2021년 3월 16일

MATLAB Online에서 열기

0 개 추천

Hi Denxybel, the main problem lies in the readtable statement. Althought the documentation states that "by default, readtable creates variables that have data types that are appropriate for the data values detected in each column of the input file", this does not work here:

data = readtable('DoseAtDemoSphereR1Q1C1.csv')
data =
  1×14 table
               x_                        Results                xFor                scorer                DoseAtDemoSphereR1Q1C1                Var6                          Var7                 Var8       Var9    Var10    Var11    Var12    Var13    Var14
    ________________________    __________________________    ________    __________________________    __________________________    _________________________    __________________________    _________    ____    _____    _____    _____    _____    _____
    {'0.02753477916121483,'}    {'2.753477962114824e-10,'}    {'170,'}    {'2.866090017161038e-05,'}    {'2.866090045821939e-13,'}    {'5.35358762496883e-07,'}    {'4.827696467064431e-07,'}    0.0036571    NaN      NaN      NaN      NaN      NaN      NaN 

As you can see, most variables are read in as character vectors, especially the first and third column that you are interested in. Also, the comma has not been properly recognized as the delimiter in the csv file; instead, it appears as part of the data character vectors.

You will therefore have to provide more details to readtable so that it can parse the input file correctly. There are multiple ways for doing this, for example:

data = readtable('DoseAtDemoSphereR1Q1C1.csv', 'Format', '%f', 'Delimiter', ',')
data =
  8×1 table
       Var1   
    __________
      0.027535
    2.7535e-10
           170
    2.8661e-05
    2.8661e-13
    5.3536e-07
    4.8277e-07
     0.0036571

In addition (although not strictly necessary here), you could supply details about the header comments by adding 'CommentStyle','#' to the list of arguments.

Your output is now a 8-by-1 table, so to reduce it to a 2-by-1 table containing only the first and third value you would use

data([1 3],1)

Instead of converting your table to an array and then use dlmwrite (which is no longer recommended), you can simply write the table directly to a file using writetable. I would also suggest to first collect all data inside the loop and write them to a file all at once after the loop; this should be much more efficient than appending data to a file inside the loop.

Your final code could then look something like this:

path = '/Applications/trial';
list = dir(fullfile(path, '*.csv'));
% preallocate table and define variables
T = table('Size', [numel(list) 3],...
    'VariableTypes', {'string','double','double'},...
    'VariableNames', {'Name','Energy','Number of Particles'});
for i = 1:numel(list)
    F = fullfile(path, list(i).name);
    data = readtable(F, 'Format', '%f', 'Delimiter', ',');
    T{i,1} = string(F);
    T{i,2} = data{1,1};
    T{i,3} = data{3,1};
end
% write data to file
writetable(T, 'test.csv');

댓글 수: 1
이전 댓글 -1개 표시 이전 댓글 -1개 숨기기

Denxybel Montinola 2021년 3월 17일

MATLAB Online에서 열기

Thank you very much for the comprehensive explanation @Jorg Woehl! It is really helpful.

I have actually figured out how to spit the data and write it in a matrix form yesterday. This is what I got

clear all; clc
path = '/Applications/trial';
list = dir(fullfile(path, '*.csv'));
for i = 1:numel(list)
    F = fullfile(path, list(i).name);
    rawdata = readtable([F], 'Delimiter', 'comma', 'readvariable',false);
    data = table2array(rawdata(1,[1 3]));
    writematrix(data,'test.csv','WriteMode','append'); 
end

but with your help, everthing works great!

P.S. Under the name column, I think this should be this in order to write the filename not the whole path. but no biggies, it works really well!

T{i,1}=string(list(i).name);

댓글을 달려면 로그인하십시오.

Multi-batch csv processing error

댓글 수: 0
이전 댓글 -2개 표시 이전 댓글 -2개 숨기기

채택된 답변

댓글 수: 1
이전 댓글 -1개 표시 이전 댓글 -1개 숨기기

추가 답변 (0개)

카테고리

제품

릴리스

태그

Community Treasure Hunt

Multi-batch csv processing error

댓글 수: 0 이전 댓글 -2개 표시 이전 댓글 -2개 숨기기

채택된 답변

댓글 수: 1 이전 댓글 -1개 표시 이전 댓글 -1개 숨기기

추가 답변 (0개)

카테고리

제품

릴리스

태그

참고 항목

Community Treasure Hunt

댓글 수: 0
이전 댓글 -2개 표시 이전 댓글 -2개 숨기기

댓글 수: 1
이전 댓글 -1개 표시 이전 댓글 -1개 숨기기