I'm getting this error when running this code:
Fatal error: Uncaught exception 'DOMException' with message 'Invalid Character Error' in test.php:29 Stack trace: #0 test.php(29): DOMDocument->createElement('1OhmStable', 'a') #1 {main} thrown in test.php on line 29
The nodes from the original XML file do contain invalid characters, but since I strip those characters out before using the text as a node name, the nodes should be created. What type of encoding do I need to apply to the original XML document? Do I need to decode the output of saveXML()?
function __cleanData($c)
{
    return preg_replace("/[^A-Za-z0-9]/", "",$c);
}
$xml = new DOMDocument('1.0', 'UTF-8');
$xml->load('test.xml');
$xml->formatOutput = true;
$append = array();
foreach ($xml->getElementsByTagName('product') as $product )
{
    foreach($product->getElementsByTagName('name') as $name )
    {
        $append[] = $name;
    }
    foreach ($append as $a)
    {
        $nodeName = __cleanData($a->textContent);
        $element = $xml->createElement(htmlentities($nodeName) , 'a');
    }
    $product->removeChild($xml->getElementsByTagName('details')->item(0));
    $product->appendChild($element);
}
$result = $xml->saveXML();
$file = "data.xml";
file_put_contents($file,$result);
This is what the original XML looks like:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="/v1/xsl/xml_pretty_printer.xsl" type="text/xsl"?>
<products>
    <product>
        <modelNumber>M100</modelNumber>
        <itemId>1553725</itemId>
        <details>
            <detail>
                <name>1 Ohm Stable</name>
                <value>600 x 1</value>
            </detail>
        </details>
    </product>
</products>
The new document is supposed to look like this:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="/v1/xsl/xml_pretty_printer.xsl" type="text/xsl"?>
<products>
    <product>
        <modelNumber>M100</modelNumber>
        <itemId>1553725</itemId>
        <1 Ohm Stable>
        </1 Ohm Stable>
    </product>
</products>
- You are like talking to yourself; where is the XML? – ajreal, Dec 15, 2011 at 17:27
- Why did you post the clean version? – ajreal, Dec 15, 2011 at 17:35
- Is the XML you posted the version after you removed the invalid characters? Why not post the original version? – ajreal, Dec 15, 2011 at 17:39
- No, that is the original version. I will post what the output is supposed to look like. – Ryan, Dec 15, 2011 at 17:40
4 Answers
Simply put, an element name cannot start with a number:
1OhmStable <-- rename this
_1OhmStable <-- this is fine
Related question: php parse xml - error: StartTag: invalid element name
A nice article: http://www.xml.com/pub/a/2001/07/25/namingparts.html
A Name is a token beginning with a letter or one of a few punctuation characters, and continuing with letters, digits, hyphens, underscores, colons, or full stops, together known as name characters.
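The renaming suggested above could be automated roughly as follows. This is only a sketch, assuming PHP 7 or later: `toXmlName()` is a hypothetical helper, not part of the question's code, and the underscore prefix is just one possible convention.

```php
<?php
// Hypothetical helper (not from the question): strip everything except
// ASCII letters and digits, then prefix "_" if the result is empty or
// starts with a digit, since a digit cannot start an XML name.
function toXmlName(string $text): string
{
    $name = preg_replace('/[^A-Za-z0-9]/', '', $text);
    if ($name === '' || preg_match('/^[0-9]/', $name)) {
        $name = '_' . $name;
    }
    return $name;
}

$doc = new DOMDocument('1.0', 'UTF-8');
$element = $doc->createElement(toXmlName('1 Ohm Stable'), 'a');
$doc->appendChild($element);

echo $doc->saveXML($element), "\n"; // <_1OhmStable>a</_1OhmStable>
```

Names that already start with a letter pass through unchanged, so only the offending ones pick up the prefix.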
You have not said where you get that error. If it happens after you have cleaned the value, this is my guess:
preg_replace("/[^A-Za-z0-9]/", "",$c);
This replacement is not written for UTF-8 encoded strings (which DOMDocument uses). You can make it UTF-8 aware by adding the u modifier (PCRE_UTF8):
preg_replace("/[^A-Za-z0-9]/u", "",$c);
                            ^
It's just a guess; I suggest you state more precisely in your question which part of your code triggers the error.
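To see what the u modifier changes for this particular pattern, here is a small comparison, assuming PHP 7 or later (for the `\u{...}` escape). The sample strings are made up for illustration. With valid UTF-8 both variants give the same result, because every byte of a multi-byte character falls outside the ASCII class; the difference shows up with invalid UTF-8, where the u variant returns NULL instead of a cleaned string.

```php
<?php
$pattern = '/[^A-Za-z0-9]/';

// Valid UTF-8: the trailing OHM SIGN (U+2126) is three bytes long.
$valid = "1 Ohm Stable \u{2126}";
$plainValid = preg_replace($pattern, '', $valid);
$utf8Valid  = preg_replace($pattern . 'u', '', $valid);
var_dump($plainValid === $utf8Valid); // bool(true), both are "1OhmStable"

// Invalid UTF-8: a lone 0xE9 byte, as found in Latin-1 encoded data.
$invalid = "Caf\xE9";
$plainInvalid = preg_replace($pattern, '', $invalid);
$utf8Invalid  = preg_replace($pattern . 'u', '', $invalid);
$lastError    = preg_last_error();

var_dump($plainInvalid);                      // string(3) "Caf"
var_dump($utf8Invalid);                       // NULL
var_dump($lastError === PREG_BAD_UTF8_ERROR); // bool(true)
```

A NULL result passed on as an element name would fail later, so checking the return value of preg_replace() is worthwhile whenever the u modifier is used.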
Even if __cleanData() removes every character other than the Latin letters a-z and digits, that does not guarantee the result is a valid XML name. Your function can return strings that begin with a digit, but digits are illegal name start characters in XML; they may only appear after the first character of a name. Spaces are also forbidden in names, so that is another point where your expected XML output would fail.
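The two rules above can be checked directly against DOMDocument, which throws a DOMException for names it rejects (as in the question's stack trace). This is a minimal sketch; `isAcceptedName()` is a hypothetical helper and the candidate names are made up for illustration.

```php
<?php
// Hypothetical helper: report whether DOMDocument accepts $name as an
// element name. createElement() throws DOMException for invalid names.
function isAcceptedName(DOMDocument $doc, string $name): bool
{
    try {
        $doc->createElement($name, 'a');
        return true;
    } catch (DOMException $e) {
        return false;
    }
}

$doc = new DOMDocument('1.0', 'UTF-8');

foreach (['1OhmStable', 'Ohm Stable', 'OhmStable1'] as $candidate) {
    $verdict = isAcceptedName($doc, $candidate) ? 'accepted' : 'rejected';
    echo $candidate, ': ', $verdict, "\n";
}
// 1OhmStable: rejected  (starts with a digit)
// Ohm Stable: rejected  (contains a space)
// OhmStable1: accepted  (digit after the first character)
```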
Make sure your scripts and data use the same encoding: if it is UTF-8, make sure the files have no Byte Order Mark (BOM) at the very beginning. To check, open your XML file in a text editor such as Notepad++ and convert it to "UTF-8 without BOM".
I had a similar error, but with a JSON file.
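If converting the file in an editor is not practical, the BOM can also be dropped in code before parsing. The following is a sketch only: `stripUtf8Bom()` is a hypothetical helper, and the input string stands in for the contents of a file read with file_get_contents().

```php
<?php
// Hypothetical helper: remove a leading UTF-8 byte order mark, if present.
function stripUtf8Bom(string $data): string
{
    $bom = "\xEF\xBB\xBF";
    return strncmp($data, $bom, 3) === 0 ? substr($data, 3) : $data;
}

// Stand-in for file contents that were saved with a BOM.
$raw = "\xEF\xBB\xBF" . '<?xml version="1.0" encoding="UTF-8"?><products/>';
$clean = stripUtf8Bom($raw);

$doc = new DOMDocument();
$doc->loadXML($clean);
echo $doc->documentElement->nodeName, "\n"; // products
```

Input without a BOM is returned unchanged, so the helper is safe to call on every file.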