Suppose you have the following HTML:
<style><input><div name="myDiv"></div></style>
You want to load it into a PHP DOMDocument object. How should you do it? If you use $doc->loadHTML(), the problem is that the <div> is inside the <style> tag. If you use $doc->loadXML(), the problem is that the <input> tag is never closed.
Note: I can't edit the HTML, only the PHP used to parse it, because I'm scraping here.
2 Answers
Try this:
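$doc = new DOMDocument;
// Enable libxml's recovery mode before loading the markup.
$doc->recover = true;
// $response holds the scraped markup as a string.
$doc->loadXML($response);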
Setting $doc->recover = true tells DOMDocument to try to parse non-well-formed documents. See the documentation for more information.
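For what it's worth, here is a slightly fuller sketch of the recover approach. It is only an illustration: $response is filled with the sample markup from the question (in a real scraper it would be the fetched body), and $errors and $names are names made up for this example. Recovery mode still reports the parse problems, so the sketch assumes you want them collected with libxml_use_internal_errors() rather than printed as warnings. How the broken markup gets repaired is up to libxml, so check the resulting tree before relying on it.

```php
<?php
// Sample markup from the question (assumption: a real scraper would use the fetched body).
$response = '<style><input><div name="myDiv"></div></style>';

// Collect libxml errors in an internal buffer instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$doc = new DOMDocument;
$doc->recover = true;      // libxml-specific recovery mode, not part of the DOM spec
$doc->loadXML($response);

// Keep the reported parse problems for inspection, then clear the buffer.
$errors = libxml_get_errors();
libxml_clear_errors();

// Look for the <div> in the recovered tree and collect its name attribute.
$names = [];
foreach ($doc->getElementsByTagName('div') as $div) {
    $names[] = $div->getAttribute('name');
}
```

Dumping $errors (each entry is a LibXMLError with message and line properties) shows what the parser had to recover from.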
Comments
Can't you turn the HTML into a string, explode it, and then stitch it back together with the closing tag added?
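A rough sketch of how this string-patching idea might look. Note that it swaps explode() for preg_replace(), which is not what the comment literally proposes, and it assumes the unclosed <input> tags are the only thing making the markup ill-formed. $response and $fixed are names made up for the example.

```php
<?php
// Sample markup from the question (assumption: a real scraper would use the fetched body).
$response = '<style><input><div name="myDiv"></div></style>';

// Rewrite every <input ...> tag as a self-closing <input .../> so the string is well-formed XML.
$fixed = preg_replace('/<input\b([^>]*?)\s*\/?>/i', '<input$1/>', $response);

$doc = new DOMDocument;
$loaded = $doc->loadXML($fixed);
```

This only covers <input>. Other void elements such as <br> or <img> would need the same treatment, which is why the recover approach may be less brittle for arbitrary scraped pages.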