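On a recent project, the client insisted on a weather forecast feature and required it to show at least three days. I looked through plenty of widgets online and none of them met the requirement, so what was I supposed to do? With no other choice, I wrote this scraper class myself.
All it does is scrape the seven-day forecast for the province and city you ask for. My project caches the result, but the caching relies on FLEA's cache mechanism, so posting that part would be no help to anyone who does not use FLEA, and I have left it out.
As for caching, I prefer to set the cache lifetime according to how many days you display. For example, if you only ever show today and tomorrow, the seven days of scraped data can be used six times, and you only need to scrape again once the seven days of data are used up.
I find this a fairly economical way to handle the cache.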
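The cache lifetime rule described above can be sketched roughly as follows. This is only a minimal illustration: the function names `weather_cache_days` and `weather_cache_seconds` are made up for this example and are not part of FLEA or of the class below, and it assumes one scrape always yields a fixed number of forecast days.

```php
<?php
// Hypothetical helpers, not part of the original class or of FLEA.
// $fetchedDays: forecast days obtained by one scrape (seven in this post).
// $displayDays: days shown on the page each time.
function weather_cache_days($fetchedDays, $displayDays)
{
    // The cached data stays usable as long as it still covers every
    // displayed day; never go below one day.
    return max(1, $fetchedDays - $displayDays + 1);
}

function weather_cache_seconds($fetchedDays, $displayDays)
{
    return weather_cache_days($fetchedDays, $displayDays) * 86400;
}

echo weather_cache_days(7, 2), "\n";    // 6
echo weather_cache_seconds(7, 3), "\n"; // 432000
```

With seven fetched days and two displayed days this gives the six uses mentioned above; the second value is what you might pass to a cache layer that expects a lifetime in seconds.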
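[PS]: Posting code is hard work. I remember there being a few sites that let you share code snippets, but I have forgotten their names. Could anyone remind me?
Here is the source first: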
<?php
class Model_Weather
{
    private $_server = 'http://qq.ip138.com';
    private $_ext = '.htm';
    private $_province;
    private $_city;
    function __construct($province='guangdong', $city='zhongshan')
    {
        $this->_province = $province;
        $this->_city = $city;
    }
    function setServer($server)
    {
        $this->_server = $server;
    }
    function setProvince($province)
    {
        $this->_province = $province;
    }
    function setCity($city)
    {
        $this->_city = $city;
    }
    function setExt($ext)
    {
        $this->_ext = $ext;
    }
    function getPageLink()
    {
        return $this->_server . '/weather/' .
            $this->_province . '/' .
            $this->_city .
            $this->_ext;
    }
    function getIconLink($icon)
    {
        return $this->_server . $icon;
    }
    // Caching the result is recommended; otherwise every call is slow.
    function fetch($display = 3)
    {
        // array_slice copes with fewer rows than requested and always returns an array.
        return array_slice($this->_fetch(), 0, $display);
    }
    function _fetch()
    {
        $content = $this->fopen_url($this->getPageLink());
        // Single-quoted so the double quotes of the HTML attributes need no escaping.
        $match = $this->find('<table width="700" borderColorDark="#ffffff" borderColorLight="#008000" border="1" cellspacing="0" cellpadding="1" align="center"', 'table>', $content);
        $table = $this->findAll('<tr', 'tr>', $match);
        if (count($table) < 3) {
            // Download failed or the page layout changed: nothing to parse.
            return array();
        }
        $dates = $this->getDate($table[0]);
        $icons = $this->getIcons($table[1]);
        $temperatures = $this->getTemperature($table[2]);
        $return = array();
        foreach ($dates as $i => $date)
        {
            $return[$i] = array(
                'date' => explode(' ', $date),
                'icons' => $icons[$i],
                'temperature' => $temperatures[$i],
            );
        }
        return $return;
    }
    // Dates
    function getDate($table)
    {
        // The literal patterns must match the source page's markup exactly.
        $dates = $this->findAll('<th class="tdc1" style="white-space:nowrap;"', 'td>', $table);
        return array_map('strip_tags', $dates);
    }
    // Weather icons and text
    function getIcons($table)
    {
        $tds = $this->findAll('<td', 'td>', $table);
        array_shift($tds); // skip the first cell of the row
        $rows = array();
        foreach ($tds as $i => $td)
        {
            // The icons come before the <br/>, the description after it.
            $t = explode('<br/>', $td);
            $r1 = '/src="(.*?)"/is';
            preg_match_all($r1, $t[0], $icons);
            $icon = array_map(array($this, 'getIconLink'), $icons[1]);
            $rows[$i] = array(
                'text' => isset($t[1]) ? strip_tags($t[1]) : '',
                'images' => $icon
            );
        }
        return $rows;
    }
    // Temperature
    function getTemperature($table)
    {
        $cells = $this->findAll("<td", "td>", $table);
        array_shift($cells); // skip the first cell of the row
        return array_map('strip_tags', $cells);
    }
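    // Return the first span of $content that starts with $begin and ends with $end, or '' if there is none.
    function find($begin, $end, $content)
    {
        // preg_quote keeps characters such as "/" or "." in the delimiters literal.
        $r = '/' . preg_quote($begin, '/') . '(.*?)' . preg_quote($end, '/') . '/is';
        return preg_match($r, $content, $match) ? $match[0] : '';
    }
    // Return every such span as an array (empty if there is none).
    function findAll($begin, $end, $content)
    {
        $r = '/' . preg_quote($begin, '/') . '(.*?)' . preg_quote($end, '/') . '/is';
        preg_match_all($r, $content, $matches);
        return $matches[0];
    }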
    // Download $url and return its body converted from GB2312 to UTF-8 ('' on failure).
    function fopen_url($url)
    {
        $file_content = '';
        if (ini_get('allow_url_fopen')) {
            // file_get_contents can only read URLs when allow_url_fopen is enabled.
            $file_content = @file_get_contents($url);
        } elseif (function_exists('curl_init')) {
            $curl_handle = curl_init();
            curl_setopt($curl_handle, CURLOPT_URL, $url);
            curl_setopt($curl_handle, CURLOPT_CONNECTTIMEOUT, 2);
            curl_setopt($curl_handle, CURLOPT_RETURNTRANSFER, 1);
            curl_setopt($curl_handle, CURLOPT_FAILONERROR, 1);
            curl_setopt($curl_handle, CURLOPT_USERAGENT, 'Trackback Spam Check'); // user-agent string sent with the request
            $file_content = curl_exec($curl_handle);
            curl_close($curl_handle);
        }
        if ($file_content === false || $file_content === '') {
            return '';
        }
        return iconv("gb2312", "utf-8", $file_content);
    }
}
?>
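To see how the begin/end extraction used by `find()` and `findAll()` behaves without hitting the network, here is a small standalone sketch. The function name `find_all_between` and the HTML fixture are invented for illustration; the fixture is not the real page markup of the forecast site.

```php
<?php
// Hypothetical standalone version of the class's findAll() idea.
function find_all_between($begin, $end, $content)
{
    // Non-greedy match from each $begin to the nearest following $end.
    $pattern = '/' . preg_quote($begin, '/') . '(.*?)' . preg_quote($end, '/') . '/is';
    preg_match_all($pattern, $content, $matches);
    return $matches[0];
}

// Made-up fixture: one table row with a label cell and two value cells.
$html = '<table><tr><td>Temp</td><td>27~22</td><td>29~22</td></tr></table>';

$cells = find_all_between('<td', 'td>', $html);
array_shift($cells);                       // skip the first (label) cell
$values = array_map('strip_tags', $cells); // keep only the cell text

print_r($values);
```

Because the match runs from `<td` to the nearest `td>`, each result is one whole cell including its tags, which is why `strip_tags` is applied afterwards. The approach is simple but brittle: any change to the page markup breaks it.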
Returned data structure:
Array
(
    [0] => Array
        (
            [date] => Array
                (
                    [0] => 2011-5-6
                    [1] => 星期五
                )
            [icons] => Array
                (
                    [text] => 阴
                    [images] => Array
                        (
                            [0] => http://qq.ip138.com/image/b2.gif
                        )
                )
            [temperature] => 27℃~22℃
        )
    [1] => Array
        (
            [date] => Array
                (
                    [0] => 2011-5-7
                    [1] => 星期六
                )
            [icons] => Array
                (
                    [text] => 多云
                    [images] => Array
                        (
                            [0] => http://qq.ip138.com/image/b1.gif
                        )
                )
            [temperature] => 29℃~22℃
        )
    [2] => Array
        (
            [date] => Array
                (
                    [0] => 2011-5-8
                    [1] => 星期日
                )
            [icons] => Array
                (
                    [text] => 多云
                    [images] => Array
                        (
                            [0] => http://qq.ip138.com/image/b1.gif
                        )
                )
            [temperature] => 30℃~23℃
        )
    [3] => Array
        (
            [date] => Array
                (
                    [0] => 2011-5-9
                    [1] => 星期一
                )
            [icons] => Array
                (
                    [text] => 阵雨
                    [images] => Array
                        (
                            [0] => http://qq.ip138.com/image/b3.gif
                        )
                )
            [temperature] => 29℃~21℃
        )
    [4] => Array
        (
            [date] => Array
                (
                    [0] => 2011-5-10
                    [1] => 星期二
                )
            [icons] => Array
                (
                    [text] => 阵雨转多云
                    [images] => Array
                        (
                            [0] => http://qq.ip138.com/image/b3.gif
                            [1] => http://qq.ip138.com/image/b1.gif
                        )
                )
            [temperature] => 27℃~21℃
        )
    [5] => Array
        (
            [date] => Array
                (
                    [0] => 2011-5-11
                    [1] => 星期三
                )
            [icons] => Array
                (
                    [text] => 多云
                    [images] => Array
                        (
                            [0] => http://qq.ip138.com/image/b1.gif
                        )
                )
            [temperature] => 28℃~21℃
        )
    [6] => Array
        (
            [date] => Array
                (
                    [0] => 2011-5-12
                    [1] => 星期四
                )
            [icons] => Array
                (
                    [text] => 多云
                    [images] => Array
                        (
                            [0] => http://qq.ip138.com/image/b1.gif
                        )
                )
            [temperature] => 28℃
        )
)
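As a rough idea of how this structure might be consumed, the sketch below turns it into an HTML list. The function name `render_forecast` and the markup it produces are assumptions made for this example, and the sample array simply copies the first entry from the dump above rather than calling the scraper.

```php
<?php
// Hypothetical helper: renders the array returned by fetch() as an HTML list.
function render_forecast(array $days)
{
    $html = "<ul>\n";
    foreach ($days as $day) {
        // A day may carry one or two icons (e.g. "showers turning cloudy").
        $images = '';
        foreach ($day['icons']['images'] as $src) {
            $images .= '<img src="' . htmlspecialchars($src) . '" alt="">';
        }
        $html .= '<li>' . htmlspecialchars(implode(' ', $day['date'])) . ' '
               . $images . ' '
               . htmlspecialchars($day['icons']['text']) . ' '
               . htmlspecialchars($day['temperature']) . "</li>\n";
    }
    return $html . "</ul>\n";
}

// Sample data copied from the first entry of the structure shown above.
$sample = array(
    array(
        'date' => array('2011-5-6', '星期五'),
        'icons' => array(
            'text' => '阴',
            'images' => array('http://qq.ip138.com/image/b2.gif'),
        ),
        'temperature' => '27℃~22℃',
    ),
);

echo render_forecast($sample);
```

In a real page you would pass the result of `fetch()` instead of `$sample`; escaping with `htmlspecialchars` is a precaution because the text comes from a third-party site.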
All rights reserved. Please credit the source when reposting.
Tags: PHP, crawler
Reposted from <a href="http://www.movoin.com/php-weather-class.html" title="PHP抓取天气预报" rel="bookmark">PHP抓取天气预报 | Movoin Studio</a>


OSChina lets you post code snippets: http://www.oschina.net/
@boyso Thanks, I'll give it a try!