Working with very big data faster?
Dear Matlab users,
I have to process very large data sets (point clouds, generally more than 30 000 000 points) in MATLAB. I can read the ASCII data with the "textscan" function. After reading, I need to detect invalid data (points with 0,0,0 coordinates) and then perform some mathematical operations on each point, that is, on each line of the data. My current approach is to read the data with "textscan" and assign it to a matrix, then use for loops to detect invalid points and apply the operations to each point. A sample of my code is shown below. Is there a way to avoid the for loops, or what is the best way to speed up this computation? I am looking forward to hearing from you.
fileID = fopen('some ascii data with more than 10 000 000 points');
original_data = textscan(fileID,'%f %f %f %f %f %f %f', 'delimiter',' ');
fclose(fileID);
column = original_data{1}(1);
row = original_data{1}(2);
t_matrix = [original_data{1}(7) original_data{2}(7) original_data{3}(7) original_data{4}(7)
original_data{1}(8) original_data{2}(8) original_data{3}(8) original_data{4}(8)
original_data{1}(9) original_data{2}(9) original_data{3}(9) original_data{4}(9)
original_data{1}(10) original_data{2}(10) original_data{3}(10) original_data{4}(10)];
coordinate_list(:,1) = original_data{1}(11:length(original_data{1}));
coordinate_list(:,2) = original_data{2}(11:length(original_data{2}));
coordinate_list(:,3) = original_data{3}(11:length(original_data{3}));
coordinate_list(:,4) = 0;
coordinate_list(:,5) = original_data{4}(11:length(original_data{4}));
% detect invalid points and transform each point with t_matrix
for i = 1:length(coordinate_list)
    if coordinate_list(i,1) == 0 && coordinate_list(i,2) == 0 && coordinate_list(i,3) == 0
        transformed_list(i,:) = NaN;
    else
        %transformed_list(i,:) = coordinate_list(i,:)*t_matrix;
        transformed_list((i:i),(1:4)) = coordinate_list((i:i),(1:4))*t_matrix;
        transformed_list(i,5) = coordinate_list(i,5);
    end
    i
end
6 Comments
KSSV
26 Sep 2016
You have not initialized transformed_list, so it grows on every loop iteration, and that makes the code slow. You should preallocate it before the loop.
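A minimal sketch of what preallocation could look like here. The small `coordinate_list` and the identity `t_matrix` are made-up stand-ins so the snippet runs on its own; they are not the asker's data. Filling with NaN up front also means the invalid rows need no assignment inside the loop.

```matlab
% Hypothetical stand-in data: columns are x, y, z, 0, intensity
coordinate_list = [1 2 3 0 10; 0 0 0 0 20; 4 5 6 0 30];
t_matrix = eye(4);                     % identity, only so the sketch runs

n = size(coordinate_list, 1);          % number of points (rows)
transformed_list = NaN(n, 5);          % allocate once, before the loop

for k = 1:n
    if any(coordinate_list(k, 1:3) ~= 0)
        transformed_list(k, 1:4) = coordinate_list(k, 1:4) * t_matrix;
        transformed_list(k, 5)   = coordinate_list(k, 5);
    end
    % rows with (0,0,0) coordinates keep their NaN fill
end
```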
Adam
26 Sep 2016
Have you run the profiler on your code?
doc profile
You should always do this before making any attempt at speeding up your code, otherwise how do you know which part is taking the longest time? Assumptions are generally a very bad idea!
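As a rough illustration of a profiling session: the loop below is only a placeholder workload, and in practice the code between `profile on` and `profile off` would be the reading and transformation code from the question.

```matlab
profile on                      % start collecting timing data

% Placeholder workload (an assumption, standing in for the real script)
x = zeros(1, 1e4);
for k = 1:numel(x)
    x(k) = sqrt(k);
end

profile off                     % stop collecting
profile viewer                  % open the report showing time per line and function
```

The report lists where the time is spent, which shows whether reading the file or the loop dominates before anything is rewritten.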
mustafa ozendi
26 Sep 2016
KSSV
26 Sep 2016
Does your text file contain any text, or only numbers? Can you attach a sample of the text file?
mustafa ozendi
26 Sep 2016
per isakson
26 Sep 2016
Edited: per isakson
26 Sep 2016
Use
textscan( ..., 'CollectOutput',true )
Neither of your two samples matches
textscan(fileID,'%f %f %f %f %f %f %f', 'delimiter',' ');
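A small sketch of what 'CollectOutput' changes. The real file layout is not shown in the thread, so the snippet writes a tiny hypothetical four-column file to a temporary location and reads it back; the file name and contents are assumptions for illustration only.

```matlab
% Write a tiny hypothetical sample file (x y z intensity per line)
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '%f %f %f %f\n', [1 2 3 10; 0 0 0 20; 4 5 6 30]');
fclose(fid);

% Read it back; consecutive columns of the same type are merged into one array
fid = fopen(fname, 'r');
C = textscan(fid, '%f %f %f %f', 'CollectOutput', true);
fclose(fid);
delete(fname);

data = C{1};    % one N-by-4 double matrix instead of four separate cells
```

With the data already in one matrix, the column-by-column copying into `coordinate_list` is no longer needed.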
Answers (1)
To check whether (x,y,z) are all zero, you do not need a loop. You can test every row in one step.
id = all(coordinate_list(:,1:3)==0, 2) ; % logical column, true where x, y and z are all zero
idx = find(id) ; % row positions of those points
Everything done inside the loop can be achieved without a for loop.
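One possible loop-free version of the whole detect-and-transform step, as a sketch. The sample `coordinate_list` and the identity `t_matrix` are made up so the snippet is self-contained, and the variable name `invalid` is introduced here for illustration.

```matlab
% Hypothetical stand-in data: columns are x, y, z, 0, intensity
coordinate_list = [1 2 3 0 10; 0 0 0 0 20; 4 5 6 0 30; 1 -1 0 0 40];
t_matrix = eye(4);                     % identity, only so the sketch runs

% Logical mask of rows whose x, y and z are all exactly zero
invalid = all(coordinate_list(:, 1:3) == 0, 2);

% Transform every point with one matrix product, then carry column 5 over
transformed_list = [coordinate_list(:, 1:4) * t_matrix, coordinate_list(:, 5)];

% Overwrite the invalid rows with NaN, as the original loop did
transformed_list(invalid, :) = NaN;
```

Comparing the three coordinate columns directly, rather than summing a row, keeps a point such as (1,-1,0) from being flagged by mistake and still catches an invalid point whose fifth column is nonzero.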