
A Flexibly Configurable Web Scraper in PHP

Published: 2022-10-21 15:22:38 | Category: PHP Tutorials
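  // Turn a relative link into an absolute URL on the target site.
  // Matching "https?://" at the start also leaves https links untouched.
  if (!preg_match('#^https?://#i', $val))
  {
    if (substr($val, 0, 1) == '/')
      $val = WEB_HOST.$val;
    else
      $val = WEB_HOST.'/'.$val;
  }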
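
The link normalization above can be pulled out into a small helper. The following is a minimal sketch rather than the article's own code: the function name `to_absolute_url` and the `$host` parameter (standing in for the `WEB_HOST` constant) are assumptions made for illustration. It treats `http:`, `https:`, and protocol-relative (`//host/...`) links as already absolute.

```php
<?php
// Hypothetical helper; $host plays the role of WEB_HOST in the listing.
function to_absolute_url(string $link, string $host): string
{
    // Already absolute (http/https) or protocol-relative: leave unchanged.
    if (preg_match('#^(https?:)?//#i', $link)) {
        return $link;
    }
    // Join host and path with exactly one slash between them.
    return rtrim($host, '/') . '/' . ltrim($link, '/');
}

echo to_absolute_url('/news/1.html', 'http://example.com'), "\n";
echo to_absolute_url('news/2.html', 'http://example.com/'), "\n";
echo to_absolute_url('https://other.example/a', 'http://example.com'), "\n";
```

Trimming the slashes on both sides avoids the two-branch check and also tolerates a host that is configured with a trailing slash.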
 
  // Fetch the content page; skip this link if nothing came back.
  $web_content = $this->get($val);
  if (empty($web_content))
  {
    $this->write('Fetched content page is empty, skipping it');
    continue;
  }
 
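  // Drop carriage returns and replace each newline with the placeholder '【】',
  // so the page becomes one line (presumably so the field patterns can match
  // across what were line breaks).
  $web_content = str_replace("\r", '', $web_content);
  $web_content = str_replace("\n", '【】', $web_content);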
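
One plausible reason for flattening the page: by default, `.` in a PCRE pattern does not match a newline, so a pattern such as `(.*?)` fails when the target spans several lines. The sketch below shows the effect with made-up HTML and a made-up pattern (both are assumptions, not taken from the article). The `s` modifier is shown as an alternative that avoids the placeholder.

```php
<?php
$html = "<div class=\"body\">line one\nline two</div>";
$pattern = '/<div class="body">(.*?)<\/div>/';

// On the raw text "." stops at the newline, so the pattern does not match.
$raw_matches = preg_match($pattern, $html);

// After flattening, as the listing does, the same pattern matches.
$flat = str_replace("\n", '【】', str_replace("\r", '', $html));
preg_match($pattern, $flat, $m);
echo $m[1], "\n";

// Alternative: the "s" (dot-all) modifier lets "." match newlines directly.
preg_match($pattern . 's', $html, $m2);
echo $m2[1], "\n";
```

With the placeholder approach, the stored text still contains `【】` wherever a line break was, so it has to be converted back before display.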
 
  // Build a prepared INSERT with one named placeholder per mapped column.
  $sql = "INSERT INTO ".TABLE_NAME."(".implode(', ', array_keys($table_mapping)).") VALUES (";
  foreach ($table_mapping as $field => $reg)
    $sql .= ':'.$field.',';
  $sql = substr($sql, 0, -1); // drop the trailing comma
  $sql .= ')';
  if (IS_DEBUG)
    $this->write('Executing SQL '.$sql);
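
The statement built above has the shape `INSERT INTO table(col1, col2) VALUES (:col1,:col2)`, with column names taken from the keys of `$table_mapping`. A minimal sketch of the same idea as a standalone function follows; the name `build_insert_sql`, the table name `articles`, and the sample mapping are hypothetical.

```php
<?php
// Hypothetical helper: builds "INSERT INTO t (a, b) VALUES (:a, :b)".
function build_insert_sql(string $table, array $mapping): string
{
    $columns = array_keys($mapping);
    // One named placeholder per column, e.g. "title" becomes ":title".
    $placeholders = array_map(function ($column) {
        return ':' . $column;
    }, $columns);

    return sprintf(
        'INSERT INTO %s (%s) VALUES (%s)',
        $table,
        implode(', ', $columns),
        implode(', ', $placeholders)
    );
}

// Sample mapping: column name => regular expression or literal value.
$mapping = [
    'title'   => '/<h1>(.*?)<\/h1>/',
    'content' => '/<div id="content">(.*?)<\/div>/',
    'source'  => 'example.com',
];
$built = build_insert_sql('articles', $mapping);
echo $built, "\n";
```

Table and column names cannot be bound as parameters, so they should only ever come from trusted configuration, never from scraped content.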
 
  $dsn = 'mysql:dbname='.DB_NAME.';host='.DB_HOST;
  try {
    $dbh = new PDO($dsn, DB_USER, DB_PWD);
  } catch (PDOException $e) {
    $this->write('Connection failed: ' . $e->getMessage(), true);
  }
  $dbh->query("set names 'utf8'");
  $sth = $dbh->prepare($sql);
 
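  foreach ($table_mapping as $field => $reg)
  {
    if (substr($reg, 0, 1) != '/')
    {
      // Not a regular expression: use the mapping value as a literal.
      $$field = $reg;
    }
    else
    {
      if (!preg_match($reg, $web_content, $tmp_match))
      {
        $this->write('Sorry, failed to match field: '.$field.', skipping this record');
        continue 2; // give up on this record and move on in the enclosing loop
      }
      // Store the first capture group in a variable named after the column.
      $$field = $tmp_match[1];
      $$field = $this->closetags($$field);
      // Remove JavaScript
      $$field = preg_replace('/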
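
The loop above stores each extracted value in a variable named after its column via `$$field`. The sketch below shows the same extraction step collecting the values into an array instead. It is an illustration under stated assumptions, not the article's code: the helper name `extract_fields`, the sample page, and in particular the `<script>`-stripping pattern are my own (the pattern in the listing is cut off), and the `closetags` step is left out.

```php
<?php
// Hypothetical helper: returns column => value, or null if a pattern fails.
function extract_fields(array $mapping, string $content): ?array
{
    $row = [];
    foreach ($mapping as $field => $reg) {
        if (substr($reg, 0, 1) !== '/') {
            $row[$field] = $reg;   // literal value, not a pattern
            continue;
        }
        if (!preg_match($reg, $content, $match)) {
            return null;           // the caller skips this record
        }
        // Assumed pattern for dropping <script> blocks; not the original one.
        $row[$field] = preg_replace('#<script\b[^>]*>.*?</script>#is', '', $match[1]);
    }
    return $row;
}

$page = '<h1>Hello</h1><div id="c">Text<script>alert(1)</script> more</div>';
$row = extract_fields([
    'title'   => '/<h1>(.*?)<\/h1>/',
    'content' => '/<div id="c">(.*?)<\/div>/',
    'source'  => 'example.com',
], $page);
print_r($row);
```

An array keyed by column name can be passed directly to `PDOStatement::execute()` for a statement whose named placeholders match those keys, which avoids variable variables altogether.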
 

(Editor: 航空爱好网)
