php - Strip out first IMG element in an HTML block
I have a PHP app that grabs HTML from 3rd-party sources. The HTML may come with one or more img elements in it. I want to grab the first img instance in its entirety, but I'm not sure how to go about that.
Can anyone point me in the right direction?
Thanks.
You could use XPath to parse the HTML and pull out the info you want that way. It's a little more involved than string position checking, but it has the advantage of being a bit more robust should you decide you want something more specific (the `src` and `alt` of the first `img` tag, for example).
First, load the HTML string into a DOMDocument, which is then loaded into a DOMXPath.

    // load the html into a DOMDocument, then set up the XPath
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    $xpath = new DOMXPath($doc);

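Since the HTML comes from third-party sources it may well be malformed, and `loadHTML()` tends to emit warnings for markup it has to repair. Here is a minimal sketch of one way to keep that quiet using PHP's `libxml_use_internal_errors()`. The helper name `loadHtmlQuietly` and the sample markup are my own assumptions for illustration.

```php
<?php
// Hypothetical helper (an assumption, not part of the answer above):
// build a DOMXPath from possibly malformed HTML without emitting warnings.
function loadHtmlQuietly($html)
{
    // Collect libxml parse errors internally instead of raising PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html);

    // Discard the collected errors and restore the earlier setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return new DOMXPath($doc);
}

// Made-up sample input with an unclosed paragraph and an unknown tag.
$xpath = loadHtmlQuietly('<p>Unclosed paragraph <img src="x.png" alt="x"><bogus>tag</bogus>');
$count = $xpath->query('/descendant::img[1]')->length;
```

The parser still repairs the markup as best it can, so the `img` should remain reachable through the XPath query.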
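We want the first `img` that occurs on the page, so use the selector `/descendant::img[1]`. N.B. this is not the same as `//img[1]`, though the two may often give similar results: `//img[1]` matches every `img` that is the first `img` child of its parent, so it can return several nodes. There's an explanation of the difference between the two at http://stackoverflow.com/a/453902/2287.

    $matches = $xpath->evaluate("/descendant::img[1]");
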
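To make the difference between the two selectors concrete, here is a small sketch. The markup and the `srcList` helper are made up for illustration; only the two XPath expressions come from the text above.

```php
<?php
// Hypothetical sample markup: two containers, each holding its own images.
$html = '<html><body>'
      . '<div><img src="a.png" alt="a" /><img src="b.png" alt="b" /></div>'
      . '<div><img src="c.png" alt="c" /></div>'
      . '</body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Hypothetical helper: collect the src attribute of every node a query matches.
function srcList(DOMXPath $xpath, $query)
{
    $out = array();
    foreach ($xpath->query($query) as $node) {
        $out[] = $node->getAttribute('src');
    }
    return $out;
}

// First img in document order: a single node.
$firstInDocument = srcList($xpath, '/descendant::img[1]');

// First img child of each parent: one node per container here.
$firstPerParent = srcList($xpath, '//img[1]');
```

With this markup, `$firstInDocument` holds only `a.png`, while `$firstPerParent` holds `a.png` and `c.png`, because each `div` contributes its own first `img`.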
A downside of using XPath is that it doesn't make it easy to say "give me the full string that was matched for that `img` tag", but you can put together a simple function that iterates over the matched node's attributes and rebuilds the `img` tag.

    $tag = "<img ";
    $vals = array();
    foreach ($node->attributes as $attr) {
        $vals[] = $attr->name . '="' . $attr->value . '"';
    }
    $tag .= implode(" ", $vals) . " />";

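If rebuilding the tag by hand feels like too much, one alternative worth trying is `DOMDocument::saveHTML()`, which accepts an optional node argument as of PHP 5.3.6 and serializes just that node. This is a sketch of a swapped-in technique rather than the approach described above, and the sample markup is an assumption.

```php
<?php
// Made-up sample document containing a single image.
$doc = new DOMDocument();
$doc->loadHTML('<html><body><img src="/images/my-image.png" alt="my image"></body></html>');
$xpath = new DOMXPath($doc);

// item(0) returns null when nothing matched, so guard before serializing.
$node = $xpath->query('/descendant::img[1]')->item(0);

// Serialize only the matched node rather than the whole document.
$tag = $node !== null ? $doc->saveHTML($node) : '';
```

Note that the output is the parser's own serialization of the node, so it will not necessarily match the original source text character for character (for instance, it is written as HTML rather than with a self-closing ` />`).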
Putting it all together gives something like:
```php
<?php
// example html
$html = '<html><body>'
      . ' <img src="/images/my-image.png" alt="my image" width="100" height="100" />'
      . 'some text here <img src="do-not-want-second.jpg" alt="no thanks" />';

// load the html into a DOMDocument, then set up the XPath
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// get the first img in the doc
// N.B. not the same as "//img[1]" - see http://stackoverflow.com/a/453902/2287
$matches = $xpath->evaluate("/descendant::img[1]");

foreach ($matches as $match) {
    echo buildImgTag($match);
}

/**
 * Build an img tag given its matched node
 *
 * @param DOMElement $node The img node
 *
 * @return string The rebuilt img tag
 */
function buildImgTag($node)
{
    $tag = "<img ";
    $vals = array();
    foreach ($node->attributes as $attr) {
        $vals[] = $attr->name . '="' . $attr->value . '"';
    }
    $tag .= implode(" ", $vals) . " />";
    return $tag;
}
```
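One thing to watch with a hand-rebuilt tag: the DOM hands back attribute values already decoded, so a value containing a double quote would break the rebuilt markup. Here is a hedged sketch of a variant that escapes values with `htmlspecialchars()`. The name `buildEscapedImgTag` and the sample markup are assumptions of mine for illustration.

```php
<?php
// Hypothetical variant of the tag builder that escapes attribute values.
function buildEscapedImgTag(DOMElement $node)
{
    $vals = array();
    foreach ($node->attributes as $attr) {
        // Re-encode quotes, ampersands and angle brackets in the decoded value.
        $vals[] = $attr->name . '="'
                . htmlspecialchars($attr->value, ENT_QUOTES, 'UTF-8') . '"';
    }
    return '<img ' . implode(' ', $vals) . ' />';
}

// Made-up sample: the alt text contains encoded double quotes.
$doc = new DOMDocument();
$doc->loadHTML('<html><body><img src="a.png" alt="say &quot;hi&quot;"></body></html>');
$img = $doc->getElementsByTagName('img')->item(0);

$tag = buildEscapedImgTag($img);
```

Here the rebuilt tag keeps `&quot;` in the `alt` value instead of emitting raw double quotes inside the attribute.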
So overall it's a more complex approach than doing a `strpos` or regex on the HTML, but it should provide more flexibility should you decide to do anything with the `img` tag, like pulling out a specific attribute.
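As a rough illustration of pulling out a specific attribute, the sketch below reads `src` straight from XPath and `alt` from the matched node. It reuses the sample markup from the example, closed off with `</body></html>`; the variable names are my own.

```php
<?php
$html = '<html><body>'
      . '<img src="/images/my-image.png" alt="my image" width="100" height="100" />'
      . 'some text here <img src="do-not-want-second.jpg" alt="no thanks" />'
      . '</body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Wrapping the path in string() makes evaluate() return a plain PHP string
// (an empty string if nothing matched).
$src = $xpath->evaluate('string(/descendant::img[1]/@src)');

// Alternatively, fetch the node and read attributes from it directly.
$img = $xpath->query('/descendant::img[1]')->item(0);
$alt = $img !== null ? $img->getAttribute('alt') : '';
```

Either route avoids rebuilding the whole tag when only one or two attribute values are needed.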
php html