webread CSV is missing rows that show in the browser

Ryan Klingert on 30 Jan 2020
Answered: Guillaume on 7 Feb 2020
Intro:
I want to start by saying I am pretty new to web scraping. While I have had some success working with HTML and the string-editing functions, I haven't been able to figure out how to download a table.
Background:
The overall background is that I am working on a project to build a roster-picking model for daily fantasy sports. Several websites, including the one I am using, have relatively accurate projections of each player's daily points. To backtest my model I need to collect projections from past seasons, so I am trying to scrape this site.
Question:
This site displays a table of historical results. It also has a link to download these results as a CSV: https://rotogrinders.com/projected-stats/nhl-skater.csv?site=draftkings&date=2019-12-12
The issue is that when you visit that link in a web browser you get a CSV with hundreds of rows, matching the HTML page. However, when you use webread to systematically download and save the CSV, you get only a select few of those rows. Code is posted below.
Any help would be great!
% Allow up to 15 seconds for the server to respond
options = weboptions('Timeout',15);
date = datetime(2019,12,12);
% Zero-pad the day so it is always two digits
useDay = char(string(day(date)));
if size(useDay,2) == 1
    useDay = '0' + string(useDay);
end
% Zero-pad the month the same way
useMonth = char(string(month(date)));
if size(useMonth,2) == 1
    useMonth = '0' + string(useMonth);
end
% Build the URL with the date as yyyy-MM-dd and request the CSV
html = webread('https://rotogrinders.com/projected-stats/nhl-skater.csv?site=draftkings&date=' + string(year(date)) + '-' + string(useMonth) + '-' + string(useDay), options);
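The manual zero-padding above can also be left to datetime formatting. A minimal sketch, assuming the same URL pattern as in the question; `baseUrl` and `buildUrl` are names introduced here for illustration only:

```matlab
% Base address of the CSV export, taken from the link in the question
baseUrl = "https://rotogrinders.com/projected-stats/nhl-skater.csv";

% buildUrl is a hypothetical helper: the 'yyyy-MM-dd' format zero-pads
% the month and day, so no manual padding is needed
buildUrl = @(d) baseUrl + "?site=draftkings&date=" + string(d, 'yyyy-MM-dd');

url = buildUrl(datetime(2019, 12, 12));
disp(url)
```

Calling `buildUrl` on each element of a range such as `datetime(2019,10,1):datetime(2019,12,12)` would give one URL per day for backtesting.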
  Comments: 1
Rik on 30 Jan 2020
I suspect the title might have triggered the spam filter. A word of advice: remove all content that is not relevant to MATLAB. The point of your question is that webread doesn't download the same CSV that you see in your browser, so that is the only relevant part for the question title.


Answers (1)

Guillaume on 7 Feb 2020
If I try to download the file from your link using a web browser, I get only a few rows. Considering that the main webpage shows a prominent banner telling you that you can see rosters only as a premium user, the problem seems clear: you need to be logged in to download the full file.
Modifying your weboptions to specify username/password should work (assuming the website is designed properly):
options = weboptions('Timeout',15, 'Username', '??', 'Password', '***');
%rest of code as is...
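
To check whether the credentials make a difference, the row count can be printed after the request. This is only a sketch under assumptions: the `Username` and `Password` options of weboptions are used for HTTP basic authentication, so they help only if the site accepts that scheme, which is not confirmed here. A site that logs in through a web form would need another approach. The credential values are placeholders, and `'ContentType','table'` is added here so that webread returns a table whose rows can be counted.

```matlab
% Placeholder credentials (hypothetical): replace with a real account
user = "your-username";
pass = "your-password";

% Ask webread to parse the response as a table so the rows can be counted
options = weboptions('Timeout', 15, 'Username', user, 'Password', pass, ...
                     'ContentType', 'table');

% Download link from the question
url = "https://rotogrinders.com/projected-stats/nhl-skater.csv?site=draftkings&date=2019-12-12";

try
    data = webread(url, options);
    fprintf('Downloaded %d rows\n', height(data));
catch err
    % Report network or authentication problems instead of stopping
    fprintf('Request failed: %s\n', err.message);
end
```

If the row count is the same with and without the credentials, the site is probably not using them, and the login is handled some other way.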
