URLREAD2 - User Agent and Cookies

Dan on 26 Jan 2016
Answered: Dan on 19 Sep 2018
I'm at a loss as to how to get this sample code working, and I was hoping someone could review it and assess my assumptions about what may be wrong.
Problem: I would like to use MATLAB to access a webpage that is protected by a login screen. I am able to use wget and it works fine; however, as we know, wget does not load the AJAX/JavaScript embedded within the page. Therefore, I have turned to the urlread2 function available from the MATLAB File Exchange. Hereafter, all examples are based on this function.
Example: I am trying to log in to a financial website; however, upon testing with other sites I get the same error. Therefore, for my example I am going to use fitbit.com. To mimic the behaviour of a browser, I pass the following combined headers into urlread2 (I have split the code to make it easier to see what I'm doing):
value = 'https://www.fitbit.com';
header = http_createHeader('Host',value);
value = 'keep-alive';
header2 = http_createHeader('Connection',value);
value = '278';
header3 = http_createHeader('Content-Length',value);
value = 'max-age=0';
header4 = http_createHeader('Cache-Control',value);
value = 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8';
header5 = http_createHeader('Accept',value);
value = 'https://www.fitbit.com';
header6 = http_createHeader('Origin',value);
value = 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36';
header7 = http_createHeader('User-Agent',value);
value = 'application/x-www-form-urlencoded';
header8 = http_createHeader('Content-Type',value);
value = 'https://www.fitbit.com/login';
header9 = http_createHeader('Referer',value);
value = 'gzip, deflate';
header10 = http_createHeader('Accept-Encoding',value);
value = 'en-US,en;q=0.8';
header11 = http_createHeader('Accept-Language',value);
%Generate a combined header as required by urlread2
combined_header = [header header2 header3 header4 header5 header6 header7 header8 header9 header10 header11];
With the header information defined, I generate the query string required (this is for the post operation):
queryString = 'email=myemail&password=mypassword&login=Log+In';
Finally, bring it all together for the urlread2 function:
[output,extras] = urlread2('https://www.fitbit.com/login','post',queryString,combined_header);
The following response is embedded within the HTML:
'The owner of this website (www.fitbit.com) has banned your access based on your browser''s signature (2659bb18cf10354e-ua21).'
Possible problem 1:
It may well be that I'm passing in the headers incorrectly; however, when I mimic the same headers via Firefox the page works correctly. Any advice on this would be greatly appreciated.
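One thing worth checking in the headers above: an HTTP Host header carries only the host name (not a full URL), and Content-Length has to equal the byte length of the body actually sent. Below is a minimal sketch of deriving both values instead of hard-coding them. It reuses the login URL and query string from this post; the variable names hostName and bodyLength are illustrative, and there is no guarantee that this alone clears the ban message.

```matlab
% Sketch only: derive Host and Content-Length instead of hard-coding them.
url = 'https://www.fitbit.com/login';
queryString = 'email=myemail&password=mypassword&login=Log+In';

% Host is the bare host name, without scheme or path.
tok = regexp(url, '^https?://([^/]+)', 'tokens', 'once');
hostName = tok{1};

% Content-Length is the number of bytes in the encoded body.
bodyLength = numel(unicode2native(queryString, 'UTF-8'));

% http_createHeader ships with urlread2, so only call it if it is on the path.
if exist('http_createHeader', 'file')
    hostHeader   = http_createHeader('Host', hostName);
    lengthHeader = http_createHeader('Content-Length', num2str(bodyLength));
end
```

These two headers would take the place of header and header3 in combined_header.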
Possible problem 2:
I think the problem may be down to cookies, since neither urlread2 nor (as far as I know) any other function in MATLAB supports cookies. If this is the case, does anyone have any suggestions on how to tackle this?
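On the cookie question: urlread2 returns the response headers in its second output, so one manual workaround is to copy the name=value part of each Set-Cookie line into a Cookie header on the next request. Below is a minimal sketch of that step. The sample Set-Cookie strings are made up, and reading them from extras.allHeaders.Set_Cookie is an assumption about how urlread2 names that field, to be checked against the actual output.

```matlab
% Made-up Set-Cookie values standing in for what a login response returns.
% With urlread2 these would come from the second output, e.g.
%   [~, extras] = urlread2(loginUrl);
%   setCookieLines = extras.allHeaders.Set_Cookie;   % assumed field name
setCookieLines = {'sid=abc123; Path=/; HttpOnly', 'lang=en; Path=/'};

% Keep only the leading name=value pair and drop attributes such as Path.
pairs = cellfun(@(s) strtrim(strtok(s, ';')), setCookieLines, ...
                'UniformOutput', false);

% A Cookie request header is the pairs joined by '; '.
cookieValue = strjoin(pairs, '; ');

% http_createHeader ships with urlread2, so only call it if it is on the path.
if exist('http_createHeader', 'file')
    cookieHeader = http_createHeader('Cookie', cookieValue);
end
```

cookieHeader would then be appended to the header array passed to the next urlread2 call.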
Thanks,
Dan
2 Comments
Yingyun Ai on 6 Jun 2017
Can I ask whether you solved this afterwards? Thank you. Is there any possible way to use urlread2 with basic authentication?
Jyotsana Walia on 17 Sep 2018
Hi, were you able to figure out passing basic auth to urlread2?
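For the Basic authentication question raised in these comments: Basic auth is an Authorization header of the form "Basic " followed by the Base64 encoding of user:password, so in principle it can be passed to urlread2 like any other header. Below is a minimal sketch, assuming matlab.net.base64encode is available (R2016b or later, as far as I know) and using placeholder credentials and a placeholder URL.

```matlab
% Placeholder credentials; replace with real values.
username = 'user';
password = 'pass';

% Basic auth token: Base64 of "username:password".
token = matlab.net.base64encode([username ':' password]);
authValue = ['Basic ' token];

% http_createHeader ships with urlread2, so only call it if it is on the path.
if exist('http_createHeader', 'file')
    authHeader = http_createHeader('Authorization', authValue);
    % Hypothetical request against a placeholder URL:
    %   [output, extras] = urlread2('https://example.com/protected', 'GET', '', authHeader);
end
```

Basic auth sends the credentials in an easily reversible encoding, so this is only sensible over HTTPS.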

Answers (1)

Dan on 19 Sep 2018
Sadly, I did not solve this problem. In the end I used a custom Python script.
