Dear all, I wrote this code to analyze my text data. Each run takes exactly 22 minutes to finish, which is a very long time. Since I am new to MATLAB, I don't yet know the tricks for reducing run time. Could anybody please give me some ideas regarding this code? Thank you, MATLAB community.
clear;
clc;
directory=dir('*.Y07');
count=0;
for K = 1 : length(directory)
filename = directory(K).name;
fileID = fopen(filename,'r');
formatSpec = '%s';
A_cell = textscan(fileID,formatSpec);
A=char(A_cell{1,1}{:,:});
A(find(isnan(A)))=0;
[rows,columns]=size(A);
if columns~=105
ArrayTemp=zeros(rows,105);
ArrayTemp(1:rows,1:columns)=A;
A=ArrayTemp;
A=char(A);
A(isspace(A)) = '0';
end
x1=filename;
xtr=strcat('C:\Users\maa285\Desktop\New folder (2)\',x1);
fid = fopen( xtr, 'wt' );
Record_Type = A(:,1:1);
FibsCode = A(:,2:3);
StationID = A(:,4:9);
Direction_Of_Travel = A(:,10:10);
Lane_Of_Travel = A(:,11:11);
Year_of_Data = A(:,12:13);
Month_of_Data = A(:,14:15);
Day_of_Data = A(:,16:17);
Hour_of_Data = A(:,18:19);
Vehicle_Class = A(:,20:21);
Open = A(:,22:24);
Total_Weight_of_vehicle = A(:,25:28);
Number_of_axles = A(:,29:30);
A_Axle_Weight = A(:,31:33);
A_B_Axle_spacing = A(:,34:36);
B_Axle_Weight = A(:,37:39);
B_C_Axle_spacing = A(:,40:42);
C_Axle_Weight = A(:,43:45);
C_D_Axle_spacing = A(:,46:48);
D_Axle_Weight = A(:,49:51);
D_E_Axle_spacing = A(:,52:54);
E_Axle_Weight = A(:,55:57);
E_F_Axle_spacing = A(:,58:60);
F_Axle_Weight = A(:,61:63);
F_G_Axle_spacing= A(:,64:66);
G_Axle_Weight = A(:,67:69);
G_H_Axle_spacing = A(:,70:72);
H_Axle_Weight = A(:,73:75);
H_I_Axle_spacing = A(:,76:78);
I_Axle_Weight= A(:,79:81);
I_J_Axle_spacing = A(:,82:84);
J_Axle_Weight = A(:,85:87);
J_K_Axle_spacing= A(:,88:90);
K_Axle_Weight = A(:,91:93);
K_L_Axle_spacing = A(:,94:96);
L_Axle_Weight = A(:,97:99);
L_M_Axle_spacing = A(:,100:102);
M_Axle_Weight = A(:,103:105);
%This is to convert the string to numbers so it can show the output:
Ans_1=str2num(Record_Type);
fprintf(fid,Record_Type);
Ans_2=str2num(FibsCode);
fprintf(fid,FibsCode);
Ans_3=str2num(StationID);
fprintf(fid,StationID);
Ans_4=str2num(Direction_Of_Travel);
fprintf(fid,Direction_Of_Travel);
Ans_5=str2num(Lane_Of_Travel);
fprintf(fid,Lane_Of_Travel);
Ans_6=str2num(Year_of_Data);
fprintf(fid,Year_of_Data);
Ans_7=str2num(Month_of_Data);
fprintf(fid,Month_of_Data);
Ans_8=str2num(Day_of_Data);
fprintf(fid,Day_of_Data);
Ans_9=str2num(Hour_of_Data);
fprintf(fid,Hour_of_Data);
Ans_10=str2num(Vehicle_Class);
fprintf(fid,Vehicle_Class);
Ans_11=str2num(Open);
fprintf(fid,Open);
Ans_12=str2num(Total_Weight_of_vehicle);
fprintf(fid,Total_Weight_of_vehicle);
Ans_13=str2num(Number_of_axles);
fprintf(fid,Number_of_axles);
Ans_14=str2num(A_Axle_Weight);
fprintf(fid,A_Axle_Weight);
Ans_15=str2num(A_B_Axle_spacing);
fprintf(fid,A_B_Axle_spacing);
Ans_16=str2num(B_Axle_Weight);
fprintf(fid,B_Axle_Weight);
Ans_17=str2num(B_C_Axle_spacing);
fprintf(fid,B_C_Axle_spacing);
Ans_18=str2num(C_Axle_Weight);
fprintf(fid,C_Axle_Weight);
Ans_19=str2num(C_D_Axle_spacing);
fprintf(fid,C_D_Axle_spacing);
Ans_20=str2num(D_Axle_Weight);
fprintf(fid,D_Axle_Weight);
Ans_21=str2num(D_E_Axle_spacing);
fprintf(fid,D_E_Axle_spacing);
Ans_22=str2num(E_Axle_Weight);
fprintf(fid,E_Axle_Weight);
Ans_23=str2num(E_F_Axle_spacing);
fprintf(fid,E_F_Axle_spacing);
Ans_24=str2num(F_Axle_Weight);
fprintf(fid,F_Axle_Weight);
Ans_25=str2num(F_G_Axle_spacing);
fprintf(fid,F_G_Axle_spacing);
Ans_26=str2num(G_Axle_Weight);
fprintf(fid,G_Axle_Weight);
Ans_27=str2num(G_H_Axle_spacing);
fprintf(fid,G_H_Axle_spacing);
Ans_28=str2num(H_Axle_Weight);
fprintf(fid,H_Axle_Weight);
Ans_29=str2num(H_I_Axle_spacing);
fprintf(fid,H_I_Axle_spacing);
Ans_30=str2num(I_Axle_Weight);
fprintf(fid,I_Axle_Weight);
Ans_31=str2num(I_J_Axle_spacing);
fprintf(fid,I_J_Axle_spacing);
Ans_32=str2num(J_Axle_Weight);
fprintf(fid,J_Axle_Weight);
Ans_33=str2num(J_K_Axle_spacing);
fprintf(fid,J_K_Axle_spacing);
Ans_34=str2num(K_Axle_Weight);
fprintf(fid,K_Axle_Weight);
Ans_35=str2num(K_L_Axle_spacing);
fprintf(fid,K_L_Axle_spacing);
Ans_36=str2num(L_Axle_Weight);
fprintf(fid,L_Axle_Weight);
Ans_37=str2num(L_M_Axle_spacing);
fprintf(fid,L_M_Axle_spacing);
Ans_38=str2num(M_Axle_Weight);
fprintf(fid,M_Axle_Weight);
%to establish a new arrays for the date:
monthvar = Month_of_Data;
dayvar = Day_of_Data;
yearvar = Year_of_Data;
datevar = strcat(num2str(monthvar),'/',num2str(dayvar),'/',num2str(yearvar));
% DayNumber = weekday(datevar);
[DayNumber,DayName] = weekday(datevar);
%%to return the separated vectors into one new matrix with usable data:
all=[Ans_2,Ans_2,Ans_3,Ans_4,Ans_5,Ans_6,Ans_7,Ans_8,Ans_9,Ans_10,DayNumber,Ans_12,Ans_13,Ans_14,Ans_15,Ans_16,Ans_17,Ans_18,Ans_19,Ans_20,Ans_21,Ans_22,Ans_23,Ans_24,Ans_25,Ans_26,Ans_27,Ans_28,Ans_29,Ans_30,Ans_31,Ans_32,Ans_33,Ans_34,Ans_35,Ans_36,Ans_37,Ans_38];
% all(isspace(all)) = '0';
%now select each class data out of the original data A.
VehClass = Ans_10;
Class9_data = all(VehClass == 9, :);
count=count+1;
end

Comments: 1

MAHMOUD ALZIOUD on 9 Nov 2017 (edited 9 Nov 2017)
Note: this code reads multiple files (K of them), which is why I use the fprintf command.

Accepted Answer

Walter Roberson on 9 Nov 2017

You are using fprintf() with no format specifier, passing in what looks to be character strings. The character strings are going to be interpreted as being format strings. That will mostly result in the character strings being output. However, they will be output with no spaces between them and no delimiter.
You can improve performance by calling fprintf() fewer times, such as
fprintf(fid, '%s %s %s %s', H_Axle_Weight, H_I_Axle_spacing, I_Axle_Weight, I_J_Axle_spacing)
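
One thing to watch when batching fprintf() calls this way: the fields in the question are char matrices with one row per record, and a char matrix passed to %s is read column by column. The following is only a minimal sketch, assuming the goal is one space-separated line per record. The sample data and the names fieldA and fieldB are made up for illustration, and sprintf() stands in for fprintf(fid, ...) so the result can be inspected.

```matlab
% Hypothetical sample: two records, two fixed-width fields.
fieldA = ['012'; '345'];    % e.g. a 3-character weight field
fieldB = ['67'; '89'];      % e.g. a 2-character field

nRows = size(fieldA, 1);
sep   = repmat(' ', nRows, 1);             % one separator per record
eol   = repmat(sprintf('\n'), nRows, 1);   % one line break per record

% Put the fields side by side, one row per record.
block = [fieldA, sep, fieldB, eol];

% Transpose so that column-major traversal walks record by record.
% With a file, this would be a single call: fprintf(fid, '%s', block.');
out = sprintf('%s', block.');
```

With this layout there is one output call per file instead of one per field, and the record boundaries are preserved.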

Comments: 1

Thank you very much, Mr. Walter. I will modify my code and run it now.

More Answers (2)

Gregory McFadden on 9 Nov 2017

Try running your code after issuing
profile on
and then, when it is done, use
profile viewer
That will tell you exactly which lines are consuming the time in the code. Then we can focus on those specific slow spots.
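
The same profiling session can also be read programmatically. This is a rough sketch rather than a required workflow: the loop is a stand-in workload (an assumption, not the asker's script), and it lists the slowest recorded functions from the struct returned by profile('info').

```matlab
profile on                        % start collecting timing data

% Stand-in workload: repeated str2num calls, similar to the question's script.
for k = 1:200
    v = str2num('123');           %#ok<ST2NM>
end

profile off                       % stop collecting
p = profile('info');              % struct with a FunctionTable field

% Sort the recorded functions by total time, slowest first.
[~, order] = sort([p.FunctionTable.TotalTime], 'descend');
for k = order(1:min(5, numel(order)))
    fprintf('%-40s %8.3f s\n', ...
        p.FunctionTable(k).FunctionName, p.FunctionTable(k).TotalTime);
end
```

For interactive work, profile viewer shows the same information with per-line detail.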

Comments: 5

So you mean I type profile on, then run the code, then type profile viewer?
I did it and I am waiting for the run to finish.
I tried it and it gave me this, what does this mean?
You should click on the "strcat" part. The profiler will show you a summary of where strcat was called.
I would suggest that you should be considering using sprintf() instead of all of those num2str() and strcat()
Thank you very much for your advice, I will try it now.
This saved me 5 minutes: instead of 22 minutes per run, it is now 17. So it worked. Thank you for this.
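
Going one step further than replacing num2str() and strcat(), the date text may not need to be built at all, since weekday() also accepts serial date numbers. The sketch below is one possible approach, not the method used in the thread. It assumes every character in the date columns is a digit and that two-digit years belong to the 2000s (both assumptions), and the sample values are invented.

```matlab
% Hypothetical sample standing in for A(:,12:13), A(:,14:15), A(:,16:17).
Year_of_Data  = ['07'; '07'];
Month_of_Data = ['11'; '11'];
Day_of_Data   = ['09'; '10'];

% sscanf reads a char matrix in column order, so transpose first;
% '%2d' then consumes two digits per record and returns a numeric column.
yy = sscanf(Year_of_Data.',  '%2d');
mm = sscanf(Month_of_Data.', '%2d');
dd = sscanf(Day_of_Data.',   '%2d');

% Assumption: two-digit years are 20xx.
DayNumber = weekday(datenum(2000 + yy, mm, dd));   % 1 = Sunday ... 7 = Saturday
```

Here 9 Nov 2007 is a Friday and 10 Nov 2007 a Saturday, so DayNumber is [6; 7].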


Jan on 10 Nov 2017

After
A=char(A_cell{1,1}{:,:});
A is a char array. It therefore cannot contain NaNs, because NaN exists only for double and single values. The line:
A(find(isnan(A)))=0;
is useless as a consequence.
The numbered variables such as "Ans_1" look cruel. It is a horror to debug this.
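
Both points can be illustrated together. The following is a hedged sketch, not a drop-in replacement: the sample records are invented, only the first six fields from the question are listed, and the digit arithmetic assumes every character in a field is '0' to '9'. It swaps the numbered Ans_ variables for a struct keyed by field name and replaces str2num() with place-value arithmetic.

```matlab
% Hypothetical two-record sample laid out like the first 13 columns of A.
A = ['1360012341507'; '1360012343507'];

hasNaN = any(isnan(A(:)));   % always false for char data

% Field name, first column, last column (taken from the question's slicing).
fields = {'Record_Type',          1,  1; ...
          'FibsCode',             2,  3; ...
          'StationID',            4,  9; ...
          'Direction_Of_Travel', 10, 10; ...
          'Lane_Of_Travel',      11, 11; ...
          'Year_of_Data',        12, 13};

rec = struct();
for k = 1:size(fields, 1)
    cols  = fields{k, 2}:fields{k, 3};
    place = 10 .^ (numel(cols)-1:-1:0).';   % place values, e.g. [100; 10; 1]
    % Digit characters minus '0' give digit values; the product sums them per row.
    rec.(fields{k, 1}) = (A(:, cols) - '0') * place;
end
```

After the loop, rec.StationID or rec.Year_of_Data holds one numeric value per record, and adding a field means adding one row to the table.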
