I am trying to do web scraping by following this tutorial: https://medium.com/@roymilaniitd/web-scraping-to-extract-news-using-matlab-dd78b954684 . But when I test the following code:
html = webread('https://www.indiatoday.in/top-stories');
list = extractBetween(html,'<h3 class=”” title=','</a></h3><p>');
list2=extractAfter(list,'<a href="');
list3 = extractAfter(list2,'">');
I get the page contents in the char variable html, but list, list2, and list3 are all 0×1 cell arrays!
Why does this happen?

Answers (2)

Jan
Jan on 28 Jan 2019
Edited: Jan on 28 Jan 2019


You are searching for:
'<h3 class=”” title='
% ^^
I'm sure you mean:
'<h3 class="" title='
with standard double quotes ".
The author of that page seems to have used a tool like MS Word to create the webpage, and its automatic replacement inserted smart quotes. This is a very bad idea when posting code on the internet.
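To see the difference without depending on the live site, here is a minimal sketch. The html string below is a made-up fragment standing in for the downloaded page, so the tag layout, title and link are assumptions, not the real markup of indiatoday.in:

```matlab
% Hypothetical fragment standing in for the output of webread (assumed markup).
html = '<h3 class="" title="Story one"><a href="/india/story-1">Story one</a></h3><p>text</p>';

% Pattern with smart quotes (U+201D), as copied from the tutorial.
% These characters do not occur in the HTML, so nothing is matched.
smartPattern = ['<h3 class=' char(8221) char(8221) ' title='];
bad = extractBetween(html, smartPattern, '</a></h3><p>');

% Same pattern with straight double quotes, which is what the page contains.
list  = extractBetween(html, '<h3 class="" title=', '</a></h3><p>');
list2 = extractAfter(list, '<a href="');   % drop everything up to the link target
list3 = extractAfter(list2, '">');         % drop the link target, keep the headline

disp(isempty(bad))   % the smart-quote search finds nothing
disp(list3)          % headline text from the sample fragment
```

With the real page, the same three calls should return non-empty cells as long as the site still uses this markup.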
Milan Roy
Milan Roy on 29 Jan 2019


Yes, just use the standard " " instead of the formatted double quote. It should work fine.
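If you paste patterns from web pages often, one possible safeguard is to normalize the quotes before using the text. This is only a sketch: the variable names are made up, and it covers just the four common curly quote characters:

```matlab
% Hypothetical pattern as pasted from a web page, containing two U+201D characters.
pasted = ['<h3 class=' char(8221) char(8221) ' title='];

% Replace curly double quotes (U+201C, U+201D) with a straight double quote.
fixed = replace(pasted, {char(8220), char(8221)}, '"');

% Replace curly single quotes (U+2018, U+2019) with a straight single quote.
fixed = replace(fixed, {char(8216), char(8217)}, '''');

% Positions of any remaining non-ASCII characters, worth checking by hand.
suspect = find(double(fixed) > 127);
```

Checking suspect before running a search makes this kind of copy-and-paste problem visible immediately.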

Asked: 28 Jan 2019

Answered: 29 Jan 2019
