
How can I split a word? (There is a special case)

Kerem Kayan, April 3, 2021
Edited: Brahmadev, February 14, 2024
Hello everyone, I'm trying to create a dataset for my project, and I'm pulling the data from the internet. I need the contents of the HTML "ul" tags. When two "ul" elements appear one under the other, the extracted text has no space between their words. How can I split them automatically?
  Comments: 2
David Hill, April 3, 2021
I do not understand your question.
DGM, April 3, 2021
We would need to know how you're reading the data in to be able to tell you how to improve it. Trying to fix the merged words after the fact is going to be a lot more problematic.


Answers (1)

Brahmadev, February 14, 2024
Edited: Brahmadev, February 14, 2024
As I understand your question, you would like to split consecutive unordered list (<ul>) elements in HTML data using MATLAB. I am assuming that you already have the data stored in a MATLAB variable. If not, you can read the webpage using the "webread" function. See the example below:
url = "https://www.mathworks.com/help/matlab/ref/webread.html"; % Edit the URL with your webpage
HTMLcode = webread(url);
Now that HTMLcode holds the page as a character array, we can use the following approach to split <ul> elements that directly follow one another. Note that I have used a smaller example here for clarity.
HTMLcode = '<ul><li>First list item</li></ul><ul><li>Second list item</li></ul>';
% Regular expression to match </ul> followed immediately by <ul>
pattern = '(</ul>)(<ul>)';
% Insert a delimiter between the two tags
delimiter = '#SPLIT#';
html_with_delimiter = regexprep(HTMLcode, pattern, ['$1', delimiter, '$2']);
% Split the string at the delimiter
split_html = strsplit(html_with_delimiter, delimiter);
% Display the separate <ul> elements
for i = 1:numel(split_html)
    disp(split_html{i});
end
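The pattern above only matches when </ul> is followed immediately by <ul>. On a real page there may be whitespace or a newline between the two tags, or attributes on the <ul> tag. As a rough sketch under those assumptions, each list can instead be matched as a whole block and its <li> texts captured. The sample input and the "menu" class name are made up for illustration, and nested lists are not handled.

```matlab
% Sample input (hypothetical): two adjacent lists separated by a newline,
% the second one with an attribute on its <ul> tag.
HTMLcode = ['<ul><li>First list item</li></ul>', newline, ...
            '<ul class="menu"><li>Second list item</li><li>Third list item</li></ul>'];

% Match each <ul ...>...</ul> block lazily (nested lists are not handled).
blocks = regexp(HTMLcode, '<ul(?:\s[^>]*)?>.*?</ul>', 'match');

% For each block, capture the text between <li ...> and </li>.
items = cell(size(blocks));
for k = 1:numel(blocks)
    tok = regexp(blocks{k}, '<li(?:\s[^>]*)?>(.*?)</li>', 'tokens');
    items{k} = cellfun(@(t) strtrim(t{1}), tok, 'UniformOutput', false);
end

% Display the items of each list on one line, separated by spaces.
for k = 1:numel(items)
    fprintf('List %d: %s\n', k, strjoin(items{k}, ' '));
end
```

Here items{k} is a cell array holding the item texts of the k-th list, so the words of different lists stay separate without needing a delimiter string.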
You can refer to the following MathWorks documentation for more information on the functions used:
  1. webread: https://www.mathworks.com/help/matlab/ref/webread.html
  2. regexprep: https://www.mathworks.com/help/matlab/ref/regexprep.html
  3. strsplit: https://www.mathworks.com/help/matlab/ref/strsplit.html
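If the end goal is plain text rather than separate HTML fragments, the merged words may come from deleting the tags outright. This is only a guess at how the text is being extracted, since the original post does not show that step. A minimal sketch that replaces each tag with a space and then collapses the extra whitespace:

```matlab
HTMLcode = '<ul><li>First list item</li></ul><ul><li>Second list item</li></ul>';

% Deleting the tags merges the last and first words of adjacent lists
merged = regexprep(HTMLcode, '<[^>]+>', '');
disp(merged);   % First list itemSecond list item

% Replacing each tag with a space keeps the words apart
txt = regexprep(HTMLcode, '<[^>]+>', ' ');
% Collapse runs of whitespace into one space and trim both ends
txt = strtrim(regexprep(txt, '\s+', ' '));
disp(txt);      % First list item Second list item
```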
Hope this helps!
