How do I extract subsets from a vector?

Hello,
I have a vector of size [70397,2]. It has a title of 2 rows and 4 rows of text between each data set (2x1801 each). This means a total of 39 sets.
I need to extract the second column of each set into a matrix of size [39,1801].
What would be the most efficient way to do it? (I have 100 data sets similar to this one to process.)
Thank you

3 Comments

Dyuman Joshi on 17 Jan 2023
It's not clear what your data is like. It would be helpful if you attach the data or give a sample of what your data looks like.
Robert Jones on 17 Jan 2023
Here are some excerpts from the data file.
The file starts below. Line 3 is an empty line. 1801 is the number of lines per data set that need to be extracted. ==========
# TICRA DATA EXPORT
# 2023-01-17T06:32:48
# Curve: 23.7 GHz → E_lhc → 0.0 deg [Imaginary]
# From : imported_working_1/ff_sphcut_C_feed_sub_main_allpoints_CP.cut
1801
0.00E+00 -6.54E-09
1.00E-01 -2.00E-02
2.00E-01 -7.96E-02
3.00E-01 -1.78E-01
4.00E-01 -3.15E-01
5.00E-01 -4.87E-01
6.00E-01 -6.94E-01
7.00E-01 -9.32E-01
8.00E-01 -1.20E+00
9.00E-01 -1.49E+00
1.00E+00 -1.80E+00
1.10E+00 -2.14E+00
1.20E+00 -2.48E+00
1.30E+00 -2.84E+00
1.40E+00 -3.19E+00
1.50E+00 -3.55E+00
1.60E+00 -3.91E+00
1.70E+00 -4.25E+00
1.80E+00 -4.58E+00
..
...
# Curve: 25.7 GHz → E_lhc → 0.0 deg [Imaginary]
# From : imported_working_1/ff_sphcut_C_feed_sub_main_allpoints_CP.cut
1801
0.00E+00 -6.54E-09
1.00E-01 -2.00E-02
2.00E-01 -7.96E-02
3.00E-01 -1.78E-01
...
...
# Curve: 23.7 GHz → E_lhc → 0.0 deg [Imaginary]
# From : imported_working_1/ff_sphcut_C_feed_sub_main_allpoints_CP.cut
1801
0.00E+00 -6.54E-09
1.00E-01 -2.00E-02
2.00E-01 -7.96E-02
3.00E-01 -1.78E-01
...
...
dpb on 17 Jan 2023
Edited: dpb on 17 Jan 2023
Attach the file as a file with the paperclip instead of pasting text... You could excerpt only a few sets; whether there are 2 or 2000 sets won't matter for testing.


Answers (1)

dpb on 17 Jan 2023
Edited: dpb on 17 Jan 2023

One way, just using the form of the file...
S=readlines('yourfile.txt');              % import as string array, one element per line
ixFrom=find(startsWith(S,"# From"));      % index to beginning of each section (the file has "# From :" with a space before the colon)
sizeSection=str2double(S(ixFrom(1)+1));   % size of each group -- all must be the same per the Q spec
isFrom=ixFrom+2;                          % skip the "# From" line and the count record to reach the first data record
for i=1:numel(isFrom)
  i1=isFrom(i); i2=i1+sizeSection-1;      % start, stop records of each section
  tmp=str2double(split(S(i1:i2)));        % convert to numeric array, one row per record
  if i==1
    A=tmp;                                % save first set; keep first column as well for first set
  else
    A=[A tmp(:,2)];                       % append second column of each subsequent set
  end
end
Caution: this is air code (untested), so watch out for mismatched or missing parentheses and the like.
A is built with one data set per column, so it has 1801 rows; transpose it (A.') if you need one set per row, as in the requested [39,1801] layout.
This will return one extra column over what was requested; if you really, really don't want the time(?) data at all, then just remove the special case and start off with
A=[];
The above uses dynamic reallocation; for files no larger than these, the performance hit won't be bad compared to preallocating. Or, you could use a cell array for the intermediary and then cell2mat.
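The cell-array alternative mentioned above could look something like the following. This is only a sketch under stated assumptions: the string array S here is a small made-up stand-in for the imported file (3 records per set instead of 1801), and the names C and B are illustrative, not part of the original code.

```matlab
% Hypothetical stand-in for the imported file: 2 sets of 3 records each
S = ["# TICRA DATA EXPORT"
     "# 2023-01-17T06:32:48"
     ""
     "# Curve: A"
     "# From : a.cut"
     "3"
     "0.00E+00 -1.0E+00"
     "1.00E-01 -2.0E+00"
     "2.00E-01 -3.0E+00"
     "# Curve: B"
     "# From : a.cut"
     "3"
     "0.00E+00 -4.0E+00"
     "1.00E-01 -5.0E+00"
     "2.00E-01 -6.0E+00"];

ixFrom = find(startsWith(S,"# From"));     % one "# From" line per section
n = str2double(S(ixFrom(1)+1));            % records per section (assumed equal for all)
C = cell(numel(ixFrom),1);                 % preallocate one cell per section
for k = 1:numel(ixFrom)
    i1 = ixFrom(k)+2;                      % first data record of section k
    tmp = str2double(split(S(i1:i1+n-1))); % n-by-2 numeric array
    C{k} = tmp(:,2).';                     % keep second column as a 1-by-n row
end
B = cell2mat(C);                           % nSets-by-n, one set per row
```

Because each cell holds a row vector, cell2mat stacks them into the one-set-per-row layout the question asks for, so no final transpose is needed.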
ADDENDUM:
For early releases predating readlines, use
S=string(textread('yourfile.txt','%s','delimiter','\n','whitespace',''));

4 Comments

Robert Jones on 17 Jan 2023
Thanks, will try.
Robert Jones on 17 Jan 2023
I am using version 9.7.0.1190202 (R2019b).
Using your code, I get this error:
'Unrecognized function or variable 'readlines'.'
dpb
Didn't notice the release, sorry... In that case, take an intermediary step first:
S=string(textread('yourfile.txt', '%s', 'delimiter', '\n','whitespace', ''));
The editor will complain that textread is not recommended and suggest textscan instead. While textscan is somewhat more powerful, it's more of a pain to use because it doesn't accept a filename; you first have to open a file handle with fopen and then fclose it when done.
The above brings in the file as a cellstr() array, then converts that to a string array to be consistent with the rest of the code already posted.
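For anyone who would rather follow the editor's suggestion, the textscan route described above might be sketched as below. The scratch file and its contents are made up so that the snippet runs on its own; in practice fname would be the real data file and the writing step would be dropped.

```matlab
% Write a tiny hypothetical scratch file so the example is self-contained
fname = [tempname '.txt'];
lines = {'# TICRA DATA EXPORT','# From : a.cut','2', ...
         '0.00E+00 -1.0E+00','1.00E-01 -2.0E+00'};
fid = fopen(fname,'w');
fprintf(fid,'%s\n',lines{:});
fclose(fid);

% textscan needs a file identifier rather than a file name
fid = fopen(fname,'r');
C = textscan(fid,'%s','Delimiter','\n','Whitespace','');  % one cell entry per line
fclose(fid);
S = string(C{1});       % convert the cellstr to a string array, as with textread
delete(fname);          % remove the scratch file
```

The resulting S can then be fed to the same startsWith/split code as before.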
