Symfony - Goutte: scrape and log in to an HTTPS (secure) website
So I'm trying to use Goutte to log in to an HTTPS website, but I get the following error:

    cURL error 60: SSL certificate problem: unable to get local issuer certificate
    500 Internal Server Error - RequestException
    1 linked Exception: RingException
This is the code that the creator of Goutte says to use:

    use Goutte\Client;

    $client = new Client();
    $crawler = $client->request('GET', 'http://github.com/');
    $crawler = $client->click($crawler->selectLink('Sign in')->link());
    $form = $crawler->selectButton('Sign in')->form();
    $crawler = $client->submit($form, array('login' => 'fabpot', 'password' => 'xxxxxx'));
    $crawler->filter('.flash-error')->each(function ($node) {
        print $node->text()."\n";
    });
Or here's some code that Symfony recommends:

    use Goutte\Client;

    // make a real request to an external site
    $client = new Client();
    $crawler = $client->request('GET', 'https://github.com/login');

    // select the form and fill in some values
    $form = $crawler->selectButton('Log in')->form();
    $form['login'] = 'symfonyfan';
    $form['password'] = 'anypass';

    // submit the form
    $crawler = $client->submit($form);
The thing is, neither of them works; both produce the error posted above. I can, however, log in using the code I wrote in a past question I asked: "cURL scrape, parse/find specific content".

I want to use Symfony/Goutte to log in, because scraping the data I need will be easier that way. Any suggestions, please? Thanks!
Adding the following code fixes that error (it is cURL configuration):

    // make a real request to an external site
    $client = new Client();
    $client->getClient()->setDefaultOption('config/curl/'.CURLOPT_SSL_VERIFYHOST, false);
    $client->getClient()->setDefaultOption('config/curl/'.CURLOPT_SSL_VERIFYPEER, false);
    $crawler = $client->request('GET', 'https://github.com/login');
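
Turning off both checks silences cURL error 60, but it also means the server's certificate is no longer validated at all. A possible alternative, sketched here as an assumption rather than something tested in this thread, is to leave verification on and tell Guzzle where to find a CA bundle through its `verify` request option. The bundle path below is hypothetical, and the sketch assumes the same Guzzle 4/5-style `setDefaultOption()` used above and a Composer autoloader.

```php
<?php
// Assumption: Goutte is installed through Composer.
require __DIR__ . '/vendor/autoload.php';

use Goutte\Client;

// Hypothetical path: point this at a CA bundle you trust, for example
// the cacert.pem file published by the curl project.
$caBundle = '/path/to/cacert.pem';

$client = new Client();

// Keep certificate verification enabled, but verify against the given bundle
// instead of relying on a system store that cURL apparently cannot find.
$client->getClient()->setDefaultOption('verify', $caBundle);

$crawler = $client->request('GET', 'https://github.com/login');
```

Setting the `curl.cainfo` directive in php.ini to the same file is another way to give cURL a bundle without touching the code.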
With certificate verification disabled, a different error occurs:

    The current node list is empty.
    500 Internal Server Error - InvalidArgumentException
Once again, I'm using Goutte with Symfony and the default code to test a simple task, such as logging in to GitHub over HTTPS.
Here is the fix for the previous error, where the node list is empty.

The button on the GitHub login page says "Sign in", not "Submit" or "Log in". Unfortunately, the Goutte API isn't clear about whether the argument in $form = $crawler->selectButton('Sign in')->form(); refers to the HTML name attribute or to the actual plain text of the button. It's the plain text, which is confusing. After more research into the poorly documented API, I ended up with the following code, which works:

    // make a real request to an external site
    $client = new Client();
    $client->getClient()->setDefaultOption('config/curl/'.CURLOPT_SSL_VERIFYHOST, false);
    $client->getClient()->setDefaultOption('config/curl/'.CURLOPT_SSL_VERIFYPEER, false);
    $crawler = $client->request('GET', 'https://github.com/login');

    // select the form and fill in some values
    $form = $crawler->selectButton('Sign in')->form();
    $form['login'] = 'symfonyfan';
    $form['password'] = 'anypass';

    // submit the form
    $crawler = $client->submit($form);
    echo $crawler->html();
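
To see why the wrong button label produces "The current node list is empty", here is a small offline sketch that uses Symfony's DomCrawler (the component Goutte builds on) against an inline HTML string. The markup is a hypothetical, simplified stand-in, not GitHub's real login page, and example.com is a placeholder base URI. As far as I can tell from the component's source, selectButton() matches a button's visible label (or an input's value attribute) and also its id or name attribute, and the label match is case-sensitive.

```php
<?php
// Assumption: symfony/dom-crawler is installed through Composer (it comes with Goutte).
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

// Hypothetical, simplified login form used only for illustration.
$html = <<<'HTML'
<html><body>
<form action="/session" method="post">
    <input type="text" name="login">
    <input type="password" name="password">
    <input type="submit" name="commit" value="Sign in">
</form>
</body></html>
HTML;

// The second argument is the page URI, which form() needs to resolve the form action.
$crawler = new Crawler($html, 'https://example.com/login');

// No button is labelled "Log in", so this node list is empty.
$missing = $crawler->selectButton('Log in');

// The label "Sign in" matches the submit button.
$found = $crawler->selectButton('Sign in');

// Checking count() first avoids the InvalidArgumentException that form()
// throws when it is called on an empty node list.
if ($missing->count() === 0) {
    echo "No button matched 'Log in'\n";
}

if ($found->count() > 0) {
    $form = $found->form();
    $form['login'] = 'symfonyfan';
    $form['password'] = 'anypass';
    echo "Matched a form that submits to " . $form->getUri() . "\n";
}
```

Checking count() before calling form() turns a hard-to-read exception into an explicit "button not found" branch, which is handy when the target site changes its button text.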