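Lately I have been scraping HTML quite a lot, so here is a memo.
I used to go about it differently, but these days I use one of two approaches.
I have not benchmarked them, so I am not claiming either is the fastest; it may simply be what I am used to.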

scrape_func.php

This library is a file included in the sample code for an O'Reilly book:
http://www.oreilly.co.jp/books/4873111870/download.html
Written like this, it pulls out what you want in a fairly intuitive way.
$_rawData = getURL($url);
$_rawData = mb_convert_encoding($_rawData, "UTF-8", "auto");
$_rawData = cleanString( $_rawData );

$headline = getBlock("<div id=\"headline\">","</div>",$_rawData,false);

$title = getElement("h1", $_rawData);

XPATH

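XPath is probably the more common way to do this, although in some cases the library mentioned above is quicker.
It can also be used well beyond this one setting.
An example that pulls entry titles from this blog: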
$res = file_get_contents($url);
$dom = new DOMDocument();
@$dom->loadHTML($res); // loadHTML() is an instance method; @ silences warnings from malformed HTML
$xml = simplexml_import_dom($dom);

// Title of the latest entry
$title = $xml->xpath("//div[@class='hentry']/h2/a/text()");
echo (string) current($title);

// Title of the third entry
$title = $xml->xpath("//div[@class='hentry'][3]/h2/a/text()");
echo (string) current($title);
// Link of the third entry
$link = $xml->xpath("//div[@class='hentry'][3]/h2/a/@href");
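
The same queries can be tried without any network access. The following is only a sketch: the markup, entry titles, and hrefs are made up for illustration and are not this blog's real HTML, and it uses DOMXPath directly instead of SimpleXML so that each query returns a plain string.

```php
<?php
// Hypothetical markup standing in for a blog page (not the real site's HTML).
$html = <<<HTML
<html><body>
<div class="hentry"><h2><a href="/entry/newest">Newest post</a></h2></div>
<div class="hentry"><h2><a href="/entry/second">Second post</a></h2></div>
<div class="hentry"><h2><a href="/entry/third">Third post</a></h2></div>
</body></html>
HTML;

// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// string(...) makes evaluate() return the text of the first matching node.
$latestTitle = $xpath->evaluate("string(//div[@class='hentry'][1]/h2/a)");
$thirdTitle  = $xpath->evaluate("string(//div[@class='hentry'][3]/h2/a)");
$thirdLink   = $xpath->evaluate("string(//div[@class='hentry'][3]/h2/a/@href)");

echo $latestTitle, "\n"; // Newest post
echo $thirdTitle, "\n";  // Third post
echo $thirdLink, "\n";   // /entry/third
```

Note that `//div[@class='hentry'][3]` counts positions among matching divs under the same parent. If the entries were nested under different parents, `(//div[@class='hentry'])[3]` would select the third one in document order instead.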

Scraping really is plodding, unglamorous work.
Almost every case ends up needing its own handling.