Extracting certain data from very large text/numeric data

I am trying to extract data from a hoc file, which is a combination of text, whitespace, characters, and numbers. I need to find the row index of every occurrence of the string "section[%d]", where d is an integer. Just being able to find the rows after reading the file into a cell array with importdata would be good enough. There are upwards of 40 occurrences of the string, so I need to find all of them.

Comments: 6

Why do you need row numbers? What do you need to extract or do then with these row numbers?
First of all, you answered my last question about dealing with this, and that was great. I had forgotten to mention that, throughout the rows, the points are split every now and then by a new "section" of points, so I need to be able to identify which point corresponds to the start of a new "section". If you recall, the data for the most part looks like
  • pt3dadd(x,y,z,d,e)
  • pt3dadd(x1,y1,z1,d1,e1)
  • }
  • section[2] {
  • pt3dadd(x2,... and so on
Kelly and I also answered this question from you:
but you gave no feedback. Did you need more information?
For the current question, do you need to get the section ID, or just to split the file by section and process each section with TEXTSCAN?
This has nothing to do with calculation. I just need to find where in the text the section ID string occurs, because that gives me a reference for the first point in that section. The ID number doesn't matter much: if "section" is written 10 times throughout the points, they will be sections 1 to 10.
My regexp solution is not working for you?

Answers (2)

Walter Roberson on 28 Aug 2013
find(~cellfun(@isempty, regexp(YourCell, 'section\[\d+\]', 'start')))
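As an illustration, here is a minimal sketch of the same idea on a small made-up cell array. The variable name `hocLines` and its contents are assumptions, not the asker's data, and the sketch assumes the headers appear literally as `section[2]` (with no `%` character), so the pattern used is `section\[\d+\]`.

```matlab
% Hypothetical sample standing in for the cell array read from the hoc file.
hocLines = {'section[1] {'; ...
            '  pt3dadd(0,0,0,1,2)'; ...
            '  pt3dadd(1,1,1,1,2)'; ...
            '}'; ...
            'section[2] {'; ...
            '  pt3dadd(2,2,2,1,2)'; ...
            '}'};

% regexp on a cell array returns one cell per row; an empty cell means
% that the row does not contain a section header.
hits = regexp(hocLines, 'section\[\d+\]', 'start', 'once');

% Row indices of the section headers.
rows = find(~cellfun(@isempty, hits));
```

With this sample, `rows` holds the indices of the two header rows (1 and 5). The 'once' option is only there so each cell holds at most one start index; it is not required for the emptiness test.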
Cedric on 29 Aug 2013
Edited: Cedric on 29 Aug 2013
Based on your comment: one way to tackle that is to split the file according to section headers/footer, so you get blocks that you can process using TEXTSCAN. Example:
content = fileread('myData.txt') ;
blocks = regexp(content, '(}\s*){0,1}section\[\d+\]\s*{|}', 'split') ;
blocks = blocks(2:end-1) ;   % Eliminate the first (empty) block and the
                             % last block (after the final '}').
nBlocks = length(blocks) ;
data = cell(nBlocks, 1) ;
for bId = 1 : nBlocks
    data{bId} = textscan(blocks{bId}, 'pt3dadd(%f,%f,%f,%f,%f)') ;
end
and if you don't want data to be a cell array of cell arrays (the output of TEXTSCAN is a cell array of columns), you can replace the above line in the FOR loop with:
buffer = textscan(blocks{bId}, 'pt3dadd(%f,%f,%f,%f,%f)') ;
data{bId} = [buffer{:}] ;
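If the goal is to know which point starts each section, the block sizes are enough once every block is a numeric matrix. The following is only a sketch under that assumption: `data` is filled with made-up matrices here instead of real TEXTSCAN output, and the names `nPts`, `firstIdx`, and `allPts` are hypothetical.

```matlab
% Hypothetical stand-in for the parsed blocks: one n-by-5 matrix per section.
data = {zeros(2, 5); ones(1, 5); 2 * ones(3, 5)};

% Number of points (rows) in each section.
nPts = cellfun(@(m) size(m, 1), data);

% Index of the first point of each section in the concatenated point list:
% section 1 starts at row 1, and each later section starts right after the
% points of all previous sections.
firstIdx = cumsum([1; nPts(1:end-1)]);

% All points stacked in file order, so that allPts(firstIdx(k), :) is the
% first point of section k.
allPts = vertcat(data{:});
```

For this sample, the sections hold 2, 1, and 3 points, so they start at rows 1, 3, and 4 of `allPts`.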

This question is closed.

Asked: 28 Aug 2013

Closed: 20 Aug 2021
