Scraping Google keyword rankings with PHP
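Here is the idea: use PHP's curl functions and store the cookies. A Google results page cannot be opened with file_get_contents; the request has to fully imitate a browser. Baidu is different: file_get_contents fetches the page directly, and a regular expression handles the rest, so Baidu is not covered here.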
<?php
header("Content-Type: text/html;charset=utf-8");
function ggsearch($url_s, $keyword, $page = 1) {
    $enKeyword = urlencode($keyword);
    $rsState = false;
    $page_num = ($page - 1) * 10; // Google's start offset, 10 results per page
    if ($page <= 10) {
        $interface = "eth0:" . rand(1, 4); // rotate over eth0:1 to eth0:4 so Google is less likely to block one IP
        $cookie_file = dirname(__FILE__) . "/temp/google.txt"; // file that stores the cookies
        $url = "http://www.google.com/search?q=$enKeyword&hl=en&prmd=imvns&ei=JPnJTvLFI8HlggeXwbRl&start=$page_num&sa=N";
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, $url);
        //curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']); // alternative: reuse the visitor's browser user agent
        curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2 GTB5");
        curl_setopt($ch, CURLOPT_INTERFACE, $interface); // choose the outgoing interface/IP
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the page as a string instead of printing it
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); // follow redirects
        curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file); // send cookies saved by earlier requests
        curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file); // save cookies when the handle is closed
        $contents = curl_exec($ch);
        curl_close($ch);
        $match = "!<div\s*id=\"search\">(.*)</div>\s+<\!--z-->!";
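The listing breaks off after $match is defined, so the original ranking logic is not shown. As a rough sketch only, one way to turn the fetched HTML into a position is to list the result links in order and find the first one whose host matches the target site. The function name find_rank_in_html and its parameters are hypothetical, and the sketch assumes results appear as ordinary absolute links; Google's real markup differs and changes over time.

```php
<?php
// Hypothetical helper, not part of the original script.
// Returns the 1-based position (plus $offset) of the first link whose host is
// $domain or a subdomain of it, or false when no link matches.
function find_rank_in_html($html, $domain, $offset = 0) {
    // Collect every absolute http(s) href in document order.
    if (!preg_match_all('!<a\s[^>]*href="(https?://[^"]+)"!i', $html, $m)) {
        return false;
    }
    $domain = strtolower($domain);
    $suffix = '.' . $domain;
    $position = 0;
    foreach ($m[1] as $link) {
        $host = strtolower((string) parse_url(html_entity_decode($link), PHP_URL_HOST));
        if ($host === '') {
            continue; // skip links without a usable host
        }
        $position++;
        // Match the bare domain or any subdomain such as www.
        if ($host === $domain || substr($host, -strlen($suffix)) === $suffix) {
            return $offset + $position;
        }
    }
    return false;
}
```

With a page offset such as the $page_num value computed in ggsearch, a call like find_rank_in_html($contents, "example.com", $page_num) would give the position across pages, on the assumption that every counted link is one search result.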