This code checks whether the string 2004 appears in <date_iso></date_iso>
and, if it does, echoes some data from the entry in which the search string was found.
I was wondering whether this is the best/fastest approach, because my main concern is speed and the XML file is huge. Thank you for your ideas.
This is a sample of the XML:
<entry ID="4406">
<id>4406</id>
<title>Book Look Back at 2002</title>
<link>http://www.sebastian-bergmann.de/blog/archives/33_Book_Look_Back_at_2002.html</link>
<description></description>
<content_encoded></content_encoded>
<dc_date>20.1.2003, 07:11</dc_date>
<date_iso>2003-01-20T07:11</date_iso>
<blog_link/>
<blog_title/>
</entry>
This is the code:
<?php
$books = simplexml_load_file('planet.xml');
$search = '2004';
foreach ($books->entry as $entry) {
    if (preg_match('/' . preg_quote($search) . '/i', $entry->date_iso)) {
        echo $entry->dc_date;
    }
}
?>
3 Answers
First of all, set up a timer so you can tell whether a change actually makes things faster.
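A minimal timing sketch, assuming PHP 7.3+ for hrtime() (microtime(true) would do on older versions). The time_it() helper and the inline two-entry sample are made up for illustration; in practice you would load planet.xml and wrap whichever loop variant you want to compare.

```php
<?php
// Hypothetical helper (not from the original post): runs $fn once and
// returns the elapsed time in seconds, using the monotonic hrtime() clock.
function time_it(callable $fn): float
{
    $start = hrtime(true);               // nanoseconds
    $fn();
    return (hrtime(true) - $start) / 1e9;
}

// Small inline sample standing in for planet.xml (second entry is invented).
$xml = '<planet>'
     . '<entry><dc_date>20.1.2003, 07:11</dc_date><date_iso>2003-01-20T07:11</date_iso></entry>'
     . '<entry><dc_date>5.3.2004, 10:00</dc_date><date_iso>2004-03-05T10:00</date_iso></entry>'
     . '</planet>';
$books = simplexml_load_string($xml);

$matches = [];
$elapsed = time_it(function () use ($books, &$matches) {
    foreach ($books->entry as $entry) {
        if (strpos((string) $entry->date_iso, '2004') !== false) {
            $matches[] = (string) $entry->dc_date;
        }
    }
});

printf("found %d match(es) in %.6f s\n", count($matches), $elapsed);
```

A single run on a tiny sample says little; on the real file, repeat each variant a few times and compare the fastest or median run.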
You're rebuilding '/' . preg_quote($search) . '/i' for every entry. Build the pattern once, before the loop, or you are wasting time:
<?php
$books = simplexml_load_file('planet.xml');
$search = '2004';
$regex = '/' . preg_quote($search) . '/i';
foreach ($books->entry as $entry) {
    if (preg_match($regex, $entry->date_iso)) {
        echo $entry->dc_date;
    }
}
?>
If you are only looking for 2004 or something similar, it is worth measuring whether a simpler function such as strpos is faster.
The /i modifier is also unnecessary here, since digits have no case.
Be aware that if you use strpos, you should use the !== operator to check whether the needle occurs in the haystack. If 2004 is at position 0, strpos returns 0, and 0 != false evaluates to false, so the match would be missed.
<?php
$books = simplexml_load_file('planet.xml');
$search = '2004';
foreach ($books->entry as $entry) {
    if (strpos($entry->date_iso, $search) !== false) {
        echo $entry->dc_date;
    }
}
?>
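The position-0 case is easy to check in isolation. A small sketch, using an invented date string that starts with the search term:

```php
<?php
// Sample value (made up) where the needle sits at the very start.
$haystack = '2004-03-05T10:00';
$pos = strpos($haystack, '2004');   // returns int 0, not false

// Loose comparison: 0 == false, so "!= false" wrongly reports "not found".
$loose = ($pos != false);

// Strict comparison distinguishes int 0 from bool false.
$strict = ($pos !== false);

var_dump($pos, $loose, $strict);
```

Since ISO dates begin with the year, this is exactly the case the loop above hits for every 2004 entry, which is why the strict operator matters here.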
This might not be exactly the answer you're looking for, but I would imagine that simplexml_load_file is somewhat expensive on a large XML file, since it creates a lot of objects containing information about the elements, attributes, values, contents, etc.
For that reason I would try finding the matching elements with preg_match first (match entire entry elements that contain 2004 in their date_iso element), then either load just those matches with simplexml_load_string (the string counterpart of simplexml_load_file) to extract the desired information, or (possibly even better) use preg_match to do that as well.
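A rough sketch of the regex-only variant, under some loud assumptions: the inline sample stands in for file_get_contents('planet.xml'), the second entry is invented, and the pattern assumes that dc_date is immediately followed by date_iso and that neither contains CDATA or nested markup. Regular expressions over XML are fragile, so treat this as something to benchmark against SimpleXML rather than a drop-in replacement.

```php
<?php
// Inline sample standing in for the contents of planet.xml.
$xml = <<<XML
<planet>
<entry ID="4406">
<dc_date>20.1.2003, 07:11</dc_date>
<date_iso>2003-01-20T07:11</date_iso>
</entry>
<entry ID="5001">
<dc_date>5.3.2004, 10:00</dc_date>
<date_iso>2004-03-05T10:00</date_iso>
</entry>
</planet>
XML;

$search = '2004';

// Capture the dc_date text when the date_iso that directly follows it
// contains the search string. "~" is used as the delimiter so the
// slashes in the closing tags need no escaping.
$pattern = '~<dc_date>([^<]*)</dc_date>\s*'
         . '<date_iso>[^<]*' . preg_quote($search, '~') . '[^<]*</date_iso>~';

preg_match_all($pattern, $xml, $m);
$dates = $m[1];   // captured dc_date values of the matching entries

print_r($dates);
```

Note that this still reads the whole file into memory as one string; whether it beats simplexml_load_file on the real data is something only a measurement will show.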
Comments:
simplexml is simple, preg_match (in a loop!) is way more expensive. (takeshin, Jul 30, 2011 at 6:46)
Are you sure about that? I would think it would depend on the XML and what exactly you're looking for in it. (newenglander, Aug 1, 2011 at 15:11)