How to parse HTML text that contains singleton tags?
I have HTML data (htmldata) stored as a char array:
<html>
<head>
</head>
<body>
<div class="header">
HEADER1
</div>
<div class="content">
<br>my data
</div>
</body>
</html>
I want to retrieve the data between the tags, so I tried something like this:
import javax.xml.parsers.DocumentBuilderFactory
dbf=javax.xml.parsers.DocumentBuilderFactory.newInstance();
builder = dbf.newDocumentBuilder();
is=org.xml.sax.InputSource(java.io.StringReader(htmldata));
dom=builder.parse(is);
The above code works fine when there are no singleton tags, but it throws an error when I add one:
[Fatal Error] :36:7: The element type "br" must be terminated by the matching end-tag "</br>".
Even xmlread throws the same error:
>> xmlread(is)
[Fatal Error] :The element type "br" must be terminated by the matching end-tag "</br>".
Error using xmlread (line 106)
Java exception occurred:
org.xml.sax.SAXParseException; The element type "br" must be terminated by the matching end-tag
"</br>".
at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
Is there any workaround for this?
Comments: 1
Purav Panchal
31 Aug 2022
Hey, did you find any solution?
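One possible workaround, offered as a minimal sketch rather than a confirmed solution: the Java DOM parser and xmlread expect well-formed XML, so the HTML void elements (such as <br>) can be rewritten as self-closing tags before parsing. The sketch assumes htmldata is a char row vector and that unclosed void tags are the only well-formedness problem. The names voidTags, pattern, and xhtml are illustrative, not from the original post.

```matlab
% Sample input: a char row vector with an unclosed <br>, as in the question.
htmldata = ['<html><head></head><body>' ...
            '<div class="header">HEADER1</div>' ...
            '<div class="content"><br>my data</div>' ...
            '</body></html>'];

% HTML void elements, which never take an end tag.
voidTags = 'area|base|br|col|embed|hr|img|input|link|meta|source|track|wbr';

% Match <tag ...> or <tag ... /> for those names only. The lookahead stops
% <col from matching inside <colgroup>. Note that \b is a backspace in
% MATLAB regular expressions, so it is not used as a word boundary here.
pattern = ['<(' voidTags ')(?=\s|/|>)([^>]*?)\s*/?>'];

% Rewrite every matched tag in self-closing form, e.g. <br> becomes <br/>.
xhtml = regexprep(htmldata, pattern, '<$1$2/>', 'ignorecase');

% Parse the repaired text with the same Java DOM parser as before.
dbf = javax.xml.parsers.DocumentBuilderFactory.newInstance();
builder = dbf.newDocumentBuilder();
is = org.xml.sax.InputSource(java.io.StringReader(xhtml));
dom = builder.parse(is);

% Display the text content of each <div>.
divs = dom.getElementsByTagName('div');
for k = 0:divs.getLength()-1
    disp(strtrim(char(divs.item(k).getTextContent())));
end
```

This only repairs void tags. HTML that is not well-formed XML in other ways (unquoted attribute values, undeclared entities such as &nbsp;, unclosed <p> or <li> elements) would still make the XML parser fail, and in that case a real HTML parser is the safer choice.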