With extractHTMLText I have harvested a news article. How can I write paragraph-long blocks to a text file?

The text analysis function created a clean ASCII file out of a very complex newspaper article using the following code (which worked well!):
url = "https://www.staradvertiser.com/2021/08/22/editorial/on-politics/on-politics-gov-david-iges-handling-of-covid-19-hobbled-by-indecision-inadequate-staffers/";
code = webread(url);
str = extractHTMLText(code)
Each paragraph became a line of text. How can I write these to an ASCII file for import into a text processing program? One paragraph per line of the output file (txt or xlsx) would be best.

Answers (1)

Vatsal on 21 Feb 2024
Hi,
To write the extracted text to an ASCII file with one paragraph per line, first divide the text into paragraphs. In MATLAB this can be done with the "split" function, which divides text at the specified delimiter; for a string input it returns a string array. The delimiter must be the actual newline character (returned by the "newline" function), because '\n' in quotes is just the two literal characters backslash and n.
Here is the modified code to write each paragraph to a text file:
url = "https://www.staradvertiser.com/2021/08/22/editorial/on-politics/on-politics-gov-david-iges-handling-of-covid-19-hobbled-by-indecision-inadequate-staffers/";
code = webread(url);
str = extractHTMLText(code)
str_split = split(str, newline); % Split at real newline characters; '\n' would only match a literal backslash followed by n
str_split = str_split(strlength(strtrim(str_split)) > 0); % Drop blank lines between paragraphs
fileID = fopen('output.txt','w'); % Open a file named 'output.txt'. Change the name as required.
for i = 1:numel(str_split)
fprintf(fileID,'%s\n',str_split{i}); % Write each paragraph on its own line
end
fclose(fileID); % Close the file when done
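As a variation, here is a minimal sketch of the same idea that can be run without a network connection. The sample text is an assumption standing in for the output of extractHTMLText, and the variable name "paragraphs" is only illustrative. It uses "splitlines" to split at newline characters and skips blank lines before writing:

```matlab
% Stand-in for the output of extractHTMLText (assumed sample text, not the real article)
str = "First paragraph." + newline + newline + ...
      "Second paragraph, with a comma." + newline + ...
      "Third paragraph.";

paragraphs = splitlines(str);                        % split at actual newline characters
paragraphs = strtrim(paragraphs);                    % remove leading/trailing whitespace
paragraphs = paragraphs(strlength(paragraphs) > 0);  % drop empty entries from blank lines

fileID = fopen('output.txt', 'w');                   % hypothetical output file name
assert(fileID ~= -1, 'Could not open output.txt for writing.');
fprintf(fileID, '%s\n', join(paragraphs, newline));  % one paragraph per line
fclose(fileID);
```

If a spreadsheet is preferred, writecell(cellstr(paragraphs), 'output.xlsx') (available from R2019a) should place one paragraph per row, though I have not tested it on this article.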
I hope this helps!

Release

R2020a