
This code checks whether the string 2004 occurs in <date_iso></date_iso> and, if it does, echoes some data from the entry in which the match was found.

I was wondering if this is the best/fastest approach because my main concern is speed and the XML file is huge. Thank you for your ideas.

This is a sample of the XML:

<entry ID="4406">
 <id>4406</id>
 <title>Book Look Back at 2002</title>
 <link>http://www.sebastian-bergmann.de/blog/archives/33_Book_Look_Back_at_2002.html</link>
 <description></description>
 <content_encoded></content_encoded>
 <dc_date>20.1.2003, 07:11</dc_date>
 <date_iso>2003-01-20T07:11</date_iso>
 <blog_link/>
 <blog_title/>
</entry>

This is the code:

<?php
$books = simplexml_load_file('planet.xml');
$search = '2004';
foreach ($books->entry as $entry) {
    if (preg_match('/' . preg_quote($search) . '/i', $entry->date_iso)) {
        echo $entry->dc_date;
    }
}
?>
asked Jul 22, 2011 at 3:17

3 Answers


First of all, put up a timer so you know if things get better.
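
A minimal sketch of such a timer, assuming wall-clock time from microtime(true) is precise enough. The helper name time_it, the <planet> root element, and the second sample entry are made up for illustration; with the real data you would load planet.xml instead of the inline string.

```php
<?php
// Hypothetical helper: runs $fn $runs times and returns the average
// wall-clock time per run in seconds.
function time_it(callable $fn, int $runs = 5): float
{
    $start = microtime(true);
    for ($i = 0; $i < $runs; $i++) {
        $fn();
    }
    return (microtime(true) - $start) / $runs;
}

// Small in-memory sample so the sketch runs without planet.xml.
// The second entry is invented test data.
$xml = '<planet>'
     . '<entry><dc_date>20.1.2003, 07:11</dc_date><date_iso>2003-01-20T07:11</date_iso></entry>'
     . '<entry><dc_date>5.3.2004, 09:30</dc_date><date_iso>2004-03-05T09:30</date_iso></entry>'
     . '</planet>';
$books  = simplexml_load_string($xml);
$search = '2004';

// Time only the search loop; wrap the load call instead to time parsing.
$seconds = time_it(function () use ($books, $search) {
    $found = [];
    foreach ($books->entry as $entry) {
        if (strpos((string) $entry->date_iso, $search) !== false) {
            $found[] = (string) $entry->dc_date;
        }
    }
    return $found;
});

printf("average: %.6f s per run\n", $seconds);
```

Timing the load and the loop separately shows which of the two dominates on the real file, and therefore which one is worth optimizing.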

You're rebuilding '/' . preg_quote($search) . '/i' for each book. Build the pattern only once, before the loop; otherwise you are wasting time:

<?php
$books = simplexml_load_file('planet.xml');
$search = '2004';
$regex = '/' . preg_quote($search) . '/i';
foreach ($books->entry as $entry) {
    if (preg_match($regex, $entry->date_iso)) {
        echo $entry->dc_date;
    }
}
?>

If you are only looking for a fixed string such as 2004, check whether a simpler function such as strpos would be faster.

Also, the /i modifier is probably unnecessary, since a search string made up of digits has no upper or lower case.

answered Jul 22, 2011 at 3:52

Be aware that if you use strpos, you should use the !== operator to check whether the needle occurs in the haystack. If 2004 is at position 0, strpos returns 0, and a loose != false comparison would treat that as no match.

<?php
$books = simplexml_load_file('planet.xml');
$search = '2004';
foreach ($books->entry as $entry) {
    if (strpos($entry->date_iso, $search) !== false) {
        echo $entry->dc_date;
    }
}
?>
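
To make the position-0 case concrete, here is a small sketch comparing the loose and the strict check. The haystack value is made up so that it starts with the search string.

```php
<?php
$haystack = '2004-03-05T09:30'; // made-up value that starts with the needle
$search   = '2004';

$pos = strpos($haystack, $search); // int(0): found at the very start

$looseFound  = ($pos != false);  // false: 0 is loosely equal to false
$strictFound = ($pos !== false); // true: int 0 is not identical to bool false

var_dump($pos, $looseFound, $strictFound);
// prints int(0), bool(false), bool(true)
```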
answered Jul 23, 2011 at 11:14

This might not exactly be an answer you're looking for, but I would imagine that the function simplexml_load_file is somewhat expensive on a large XML file since it creates a lot of objects containing information about the elements, attributes, values, contents, etc.

For that reason I would first try to find the matching elements with preg_match (matching entire entry elements whose date_iso contains 2004), then either load just those matches with simplexml_load_string to extract the desired information or, possibly even better, extract it with preg_match as well.
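
A rough sketch of the regex-only variant, under several assumptions: each entry looks like the sample in the question, with dc_date appearing before date_iso; the file contains no CDATA sections or comments that would confuse the pattern; and the whole file fits in memory as a string. The function name find_dc_dates, the <planet> root element, and the second sample entry are made up for illustration. Whether this actually beats SimpleXML is something to measure, not assume.

```php
<?php
// Hypothetical helper: returns the dc_date text of every <entry> whose
// <date_iso> contains $search, using one regex over the raw XML string.
function find_dc_dates(string $xml, string $search): array
{
    $q = preg_quote($search, '~');
    // Advances lazily, one character at a time, and never crosses the
    // closing tag of the current entry.
    $inside  = '(?:(?!</entry>).)*?';
    $pattern = '~<entry\b[^>]*>' . $inside
             . '<dc_date>([^<]*)</dc_date>' . $inside
             . '<date_iso>[^<]*' . $q . '[^<]*</date_iso>~s';

    if (preg_match_all($pattern, $xml, $matches) === false) {
        // Large inputs can hit PCRE backtracking or JIT stack limits.
        throw new RuntimeException('PCRE error code ' . preg_last_error());
    }
    return $matches[1];
}

// Inline sample; with the real data this would be file_get_contents('planet.xml').
$xml = '<planet>'
     . '<entry ID="1"><dc_date>20.1.2003, 07:11</dc_date>'
     . '<date_iso>2003-01-20T07:11</date_iso></entry>'
     . '<entry ID="2"><dc_date>5.3.2004, 09:30</dc_date>'
     . '<date_iso>2004-03-05T09:30</date_iso></entry>'
     . '</planet>';

$dates = find_dc_dates($xml, '2004');
print_r($dates);
```

The pattern depends on the exact layout of the file, so it is far more brittle than an XML parser: a change in element order or an attribute on dc_date would silently produce no matches.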

answered Jul 26, 2011 at 14:41
  • simplexml is simple, preg_match (in a loop!) is way more expensive. Commented Jul 30, 2011 at 6:46
  • Are you sure about that? I would think it would depend on the XML and what exactly you're looking for in it. Commented Aug 1, 2011 at 15:11
