Scraping a website with DOM and XML in PHP
I'm trying to get a list of links from a webpage with PHP. I've tried:
$webpage = file_get_contents('http://cl1.php.net/manual/en/function.call-user-func-array.php');
$dom = new DOMDocument();
$dom->loadHTML($webpage);
$xpath = new DOMXPath($dom);
$links = $xpath->query('aside/ul/li/ul/li/a'); // returns nil
foreach ($links as $link) {
    echo $link->getAttribute('href');
}
The code works until it has to perform the query, which returns an empty object.
I've also tried this to solve the problem:
$dom->getElementsByTagName('aside')->childNodes->item(0)->childNodes->item(0)->childNodes->item(1)->childNodes->item(0)->childNodes->item(0)->childNodes;
I know this last piece of code doesn't return the a elements, but even so, it doesn't work.
Edit:
This is part of the HTML:
<aside class='layout-menu'>
  <ul class='parent-menu-list'>
    <li>
      <a href="ref.funchand.php">function handling functions</a>
      <ul class='child-menu-list'>
        <li class="current">
          <a href="function.call-user-func-array.php" title="call_user_func_array">call_user_func_array</a>
        </li>
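I don't see how your query could match. You are using a relative query against the entire document, so in essence you are doing a relative query from the document root, where the only child element is html, not aside.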
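The behaviour can be checked with a small experiment. This is only a sketch: the $html string below is a made-up stand-in for the downloaded page (the real php.net markup is nested differently), and the paths are written for that stand-in only.

```php
<?php
// Hypothetical stand-in for the downloaded page, not the real php.net markup.
$html = '<html><body><div><aside><ul><li><a href="a.php">A</a></li></ul></aside></div></body></html>';

// libxml2 reports HTML5 tags such as <aside> as errors; collect them quietly.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// No context node is given, so the relative path starts at the document node.
// Its only element child is <html>, so the first step "aside" finds nothing.
$relative = $xpath->query('aside/ul/li/a');

// The same steps written out from the root do match in this stand-in page.
$absolute = $xpath->query('/html/body/div/aside/ul/li/a');

echo $relative->length, "\n"; // 0
echo $absolute->length, "\n"; // 1
```

An empty result here is not an error: query() still returns a DOMNodeList, it just has a length of 0, which matches the "empty object" described in the question.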
Try either specifying the query from the root node, like:
// instantiate DOMXPath
$xpath = new DOMXPath($dom);
// use the full path hierarchy in the query
$links = $xpath->query('/html/body/.../aside/ul/li/ul/li/a');
Or pass the aside node as the context for XPath to use with a relative query:
// get the DOMNode object for the aside element
$aside_tag = $dom->getElementsByTagName('aside')->item(0);
// instantiate DOMXPath
$xpath = new DOMXPath($dom);
// pass the DOMNode as context to DOMXPath::query()
$links = $xpath->query('ul/li/ul/li/a', $aside_tag);
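For reference, here is a self-contained sketch of the context-node approach that runs without fetching anything. It assumes a closed-off copy of the HTML excerpt from the question, wrapped in a hypothetical html/body/div shell. The $hrefs array and the null check are additions for illustration, not part of the original answer.

```php
<?php
// Hypothetical, closed-off version of the excerpt from the question.
$html = <<<HTML
<html><body><div>
<aside class="layout-menu">
  <ul class="parent-menu-list">
    <li>
      <a href="ref.funchand.php">function handling functions</a>
      <ul class="child-menu-list">
        <li class="current">
          <a href="function.call-user-func-array.php">call_user_func_array</a>
        </li>
      </ul>
    </li>
  </ul>
</aside>
</div></body></html>
HTML;

// libxml2 reports HTML5 tags such as <aside> as errors; collect them quietly.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// item(0) returns null when the document has no <aside> element.
$aside = $dom->getElementsByTagName('aside')->item(0);

$hrefs = [];
if ($aside !== null) {
    // The relative path is now resolved against <aside>, not the document node.
    foreach ($xpath->query('ul/li/ul/li/a', $aside) as $link) {
        $hrefs[] = $link->getAttribute('href');
    }
}

print_r($hrefs);
```

Only the link inside the nested list is collected. The ref.funchand.php link sits one level higher (ul/li/a), so the path does not select it.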
php xml dom