Splitting text between characters
Hello everyone, I have a very long list of questions with their answers, all following the same pattern. I got the text by running OCR on a photo. The pattern is: question number, question, three dots, and answer. I have hundreds of question-and-answer pairs like that. I want to separate them into questions and answers for a CSV file. Is it possible to do this?
I used extractBetween for the question:
extractBetween(text,'.','...')
"640. En selektif etkili antibiyotik...Penisilinler"
But with a very long list of text, I am stuck on extracting both the question and the answer.
Can someone help me, or tell me whether this is possible?
Answers (2)
Mathieu NOE
1 Feb 2021
Hello Ongun,
I created a small text file containing these lines as an example:
"640. En selektif etkili antibiyotik...Penisilinler"
"641. En selektif etkili antibiyotik...Penisilinler1"
"642. En selektif etkili antibiyotik...Penisilinler2"
"643. En selektif etkili antibiyotik...Penisilinler3"
"644. En selektif etkili antibiyotik...Penisilinler4"
Then I tested this code, which generated the attached xlsx file:
lines = readlines('Document1.txt');
lines = lines(strlength(lines) > 0);                 % skip blank lines, e.g. after a trailing newline
n = numel(lines);
Qnumber  = zeros(n,1);                               % preallocate the three columns
Question = strings(n,1);
Answer   = strings(n,1);
for ci = 1:n                                         % assumes every line follows the pattern
    thisLine = strip(strip(lines(ci)), '"');         % remove the surrounding double quotes
    Qnumber(ci)  = str2double(extractBefore(thisLine, '.'));       % question number (text before the first period)
    Question(ci) = strtrim(extractBetween(thisLine, '.', '...'));  % question (between the first period and the three dots)
    Answer(ci)   = strtrim(extractAfter(thisLine, '...'));         % answer (text after the three dots)
end
T = table(Qnumber, Question, Answer, 'VariableNames', {'Q number','Question','Answer'})
writetable(T,'test.xlsx')
Walter Roberson
1 Feb 2021
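S = fileread('Document1.txt');
% Q is everything before the three dots, A is everything after them; one match per line
tokens = regexp(S, '^(?<Q>.+)\.{3}(?<A>.+)$', 'lineanchors', 'names', 'dotexceptnewline');
Questions = strtrim(string({tokens.Q}'));   % string column; the entries differ in length
Answers   = strtrim(string({tokens.A}'));
T = table(Questions, Answers);
writetable(T, 'QandA.csv')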
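If the question number should go into its own column as well, the regular-expression approach can be extended with a third named token. The following is only a minimal sketch and not part of the original answer: the inline sample text, the token name N, and the output file name QandA_numbered.csv are assumptions chosen for illustration, and it assumes every line starts with digits followed by a period.

```matlab
% Sample text standing in for the OCR output (hypothetical data, same pattern as in the question)
S = sprintf('%s\n', ...
    '640. En selektif etkili antibiyotik...Penisilinler', ...
    '641. En selektif etkili antibiyotik...Penisilinler1');

% N = digits before the first period, Q = question text, A = text after the three dots.
% The optional " at both ends tolerates lines that are wrapped in double quotes.
expr = '^"?(?<N>\d+)\.\s*(?<Q>.+?)\.{3}(?<A>.+?)"?\s*$';
tok  = regexp(S, expr, 'names', 'lineanchors', 'dotexceptnewline');

Number   = str2double({tok.N}');        % numeric column
Question = strtrim(string({tok.Q}'));   % string column
Answer   = strtrim(string({tok.A}'));   % string column

T = table(Number, Question, Answer);
writetable(T, 'QandA_numbered.csv')     % hypothetical output file name
```

To run this on the real data, replace the sprintf call with S = fileread('Document1.txt'). Lines that do not match the pattern are skipped by regexp, so comparing height(T) with the number of lines in the file is a quick way to spot OCR errors.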