Parsing and editing txt file line by line

Hello,
How can I automatically transform a txt file of this form, removing the strings "start" and "end":
Onset,Annotation
+234.3428079,start
+244.1317829,end
+255.1007751,start
+263.0000000,end
to this form:
+234.3428079,+244.1317829
+255.1007751,+263.0000000
Regards

Accepted Answer

Voss on 19 Jun 2024
filename_in = 'test.txt';
filename_out = 'test_out.txt';
% show the input file's content, for reference
type(filename_in)
Onset,Annotation
+234.3428079,start
+244.1317829,end
+255.1007751,start
+263.0000000,end
% read filename_in into a table of size n-by-2 containing strings
opts = detectImportOptions(filename_in);
opts = setvartype(opts,opts.VariableNames,'string');
T = readtable(filename_in,opts);
% reshape the first table variable to size n/2-by-2 appropriately,
% and write it to the output file
writematrix(reshape(T{:,1},2,[]).',filename_out)
% check the output
type(filename_out)
+234.3428079,+244.1317829
+255.1007751,+263.0000000

4 Comments

Elzbieta on 20 Jun 2024
Thank you very much, Voss! Could I ask how you removed the strings?
How this reshaping works:
writematrix(reshape(T{:,1},2,[]).',filename_out)
Voss
You're welcome!
"Could I ask how you removed the strings?"
I'm not sure what you mean. The table T contains only strings. Do you mean how the first two lines of the file ("Onset,Annotation" and a blank line) were removed? Effectively that was done by readtable since the first line in the file becomes the table's variable names and the lines after the blank become the table's data.
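To see what readtable produces here, the following is a minimal sketch that writes a small sample file and inspects the resulting table. The file name annotations_demo.txt and the use of tempdir are assumptions made only for this illustration; the sample data is copied from the question.

```matlab
% Write a small sample file with the same layout as in the question.
% The file name is hypothetical and used only for this sketch.
filename_in = fullfile(tempdir,'annotations_demo.txt');
fid = fopen(filename_in,'w');
fprintf(fid,'Onset,Annotation\n');
fprintf(fid,'+234.3428079,start\n+244.1317829,end\n');
fprintf(fid,'+255.1007751,start\n+263.0000000,end\n');
fclose(fid);

% Same import steps as in the answer: read every variable as a string
opts = detectImportOptions(filename_in);
opts = setvartype(opts,opts.VariableNames,'string');
T = readtable(filename_in,opts);

T.Properties.VariableNames   % the header line should appear here, not in the data
size(T)                      % one row per data line, two variables
T.Onset                      % first variable (same as T{:,1}): the onsets as strings
T.Annotation                 % second variable: the "start"/"end" labels
```

Only T{:,1} is passed to reshape and writematrix, so the "start" and "end" labels in the second variable never reach the output file.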
"How this reshaping works:"
T{:,1} is the first column of data in table T, i.e., a string array of size n-by-1. So it's a column vector of strings.
reshape(T{:,1},2,[]) reshapes that column vector of strings into a string matrix of size 2-by-something, such that the first 2 elements of T{:,1} become the first column of the matrix, elements 3 and 4 of T{:,1} become the second column of the matrix, and so on.
Finally that matrix is transposed (.') so that it's of size something-by-2. Now each row has the consecutive pairs of elements of T{:,1}.
Here's a concrete example:
T = table(["1";"2";"3";"4";"5";"6"])
T = 6x1 table
    Var1
    ____
    "1"
    "2"
    "3"
    "4"
    "5"
    "6"
T{:,1}
ans = 6x1 string array
    "1"
    "2"
    "3"
    "4"
    "5"
    "6"
reshape(T{:,1},2,[])
ans = 2x3 string array
    "1"    "3"    "5"
    "2"    "4"    "6"
reshape(T{:,1},2,[]).'
ans = 3x2 string array
    "1"    "2"
    "3"    "4"
    "5"    "6"
Elzbieta on 8 Jul 2024
Hello,
While processing the following file:
Onset,Annotation
+234.0000000,start
+354.0000000,end
+586.0000000,start
+704.0000000,end
+938.0000000,start
+1056.0000000,end
+1396.0000000,start
+1400.0000000,end
+1810.0000000,start
+1928.0000000,end
+2162.0000000,start
+2282.0000000,end
+2514.0000000,start
+2595.0000000,end
+2612.0000000,start
+2615.0000000,start
+2630.0000000,end
+2865.0000000,start
+2971.0000000,end
+3215.0000000,start
+3332.0000000,end
I am receiving the following error:
Error using reshape
Product of known dimensions, 2, not divisible into total number of elements, 21.
Error in parseTxtAnnotations (line 20)
writematrix(reshape(T{:,1},2,[]).',filename_out)
I am using the following code:
pathInputFolder = 'C:\Users\admin\Documents\MATLAB\PROVIDE\database28_05_2024\mat_files\ECG\trial_ECG_data_txt\trial_ECG_data_edf\edited_3\annotations\';
%filename_in = fullfile(pathInputFolder,[names{i},'_trial_ECG_data_edited_annotations.txt'])
filename_in = 'C:\Users\admin\Documents\MATLAB\PROVIDE\database28_05_2024\mat_files\ECG\trial_ECG_data_txt\trial_ECG_data_edf\edited_3\annotations\Alfredo_trial_ECG_data_edited_annotations.txt';
filename_out = 'C:\Users\admin\Documents\MATLAB\PROVIDE\database28_05_2024\mat_files\ECG\trial_ECG_data_txt\trial_ECG_data_edf\edited_3\annotations\Alfredo_trial_ECG_data_annotations_edited_parsed.txt';
% show the input file's content, for reference
type(filename_in)
% read filename_in into a table of size n-by-2 containing strings
opts = detectImportOptions(filename_in);
opts = setvartype(opts,opts.VariableNames,'string');
T = readtable(filename_in,opts);
% reshape the first table variable to size n/2-by-2 appropriately,
% and write it to the output file
writematrix(reshape(T{:,1},2,[]).',filename_out)
% check the output
type(filename_out)
Could you tell me where the problem is?
Voss on 8 Jul 2024
The file has two "start" lines in a row, at +2612.0000000 and +2615.0000000, with no "end" in between. That leaves 21 data lines, an odd number, which reshape cannot split into pairs.
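A check like the following could be run before the reshape to locate such lines. It is only a sketch: the variables onsets and labels are hypothetical stand-ins for T{:,1} and T{:,2}, filled here with an excerpt of the file above, and it reports the problem without deciding which line should be dropped.

```matlab
% Hypothetical stand-ins for T{:,1} and T{:,2}, using an excerpt of the file
onsets = ["+2514.0000000";"+2595.0000000";"+2612.0000000"; ...
          "+2615.0000000";"+2630.0000000";"+2865.0000000";"+2971.0000000"];
labels = ["start";"end";"start";"start";"end";"start";"end"];

% Rows whose label is identical to the label of the next row
idx = find(labels(1:end-1) == labels(2:end));

if isempty(idx) && mod(numel(labels),2) == 0
    % Labels alternate and come in pairs, so the reshape is safe
    pairs = reshape(onsets,2,[]).';
else
    % Report each offending pair of consecutive rows
    for k = idx.'
        fprintf('Rows %d and %d are both "%s": %s, %s\n', ...
            k, k+1, labels(k), onsets(k), onsets(k+1));
    end
end
```

With this excerpt, idx points at the row holding +2612.0000000, whose successor +2615.0000000 is also a "start". Which of the two to keep depends on what the annotations mean, so that choice is left to the user.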

Asked: 19 Jun 2024
Commented: 8 Jul 2024
