Last Updated: January 06, 2019 · 1.192K · skyzyx

Less slow, case-insensitive, XPath lookups in PHP

PHP depends on libxml2 as its underlying XML parser. libxml2 supports XPath 1.0, but not newer versions, so helpers such as XPath 2.0's lower-case() are not available. Because of this, performing case-insensitive queries (such as when you're parsing a non-compliant RSS feed) needs to be done in userland.
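To see the problem concretely, here is a minimal sketch. The sample document and the variable names are made up for illustration; it only shows that a plain XPath 1.0 name test is case-sensitive.

```php
<?php
// Sample document with an upper-case root element, as a non-compliant feed might have.
$doc = new \DOMDocument();
$doc->loadXML('<RSS version="2.0"><channel><title>Example</title></channel></RSS>');
$xpath = new \DOMXPath($doc);

// XPath 1.0 name tests are case-sensitive, so the lower-case spelling finds nothing.
$exactLower = $xpath->query('/rss');

// The spelling actually used in the document does match.
$exactUpper = $xpath->query('/RSS');

\printf("rss: %d, RSS: %d\n", $exactLower->length, $exactUpper->length);
// prints "rss: 0, RSS: 1"
```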

Querying data out of an XML structure (with DOMDocument) can be up to 100× (i.e., 10000%) faster using well-crafted XPath queries over "regular" PHP (e.g., looping, if conditionals). Most suggestions on the internet say to use XPath's translate() function to convert the entire alphabet, but this can be 8× SLOWER (e.g., 800%). We can make this around 35% less slow so that it is only 4.5–5× slower (450–500%) if we only convert the letters that are actually in the word.

This performance still isn't great, but is definitely better. Tested against PHP 7.2.

<?php
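# The element name to look for, written in lowercase.
$word = 'rss';
# count_chars() in mode 3 returns each distinct byte of the string once: 'rss' becomes 'rs'.
$elementLetters = \count_chars($word, 3);
$lettersLower = \mb_strtolower($elementLetters);
$lettersUpper = \mb_strtoupper($elementLetters);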

$query = \sprintf(
 '/*[translate(name(), \'%s\', \'%s\') = \'%s\']',
 $lettersUpper,
 $lettersLower,
 $word
);

# /*[translate(name(), 'RS', 'rs') = 'rss']
# $domxpath is assumed to be a DOMXPath instance created from an already loaded DOMDocument.
$results = $domxpath->query($query);
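
For anyone who wants to try this end to end, the following is a self-contained sketch rather than a definitive implementation. The helper name buildNameQuery(), its $fullAlphabet flag, and the sample documents are hypothetical and not part of the snippet above. It builds both the reduced-letter query and the commonly suggested full-alphabet query, then runs them against a few differently cased root elements. It also lowercases the input word first, which the snippet above leaves to the caller, and it assumes ASCII element names because count_chars() works on bytes.

```php
<?php
/**
 * Hypothetical helper: builds a case-insensitive XPath 1.0 query for a root
 * element name. With $fullAlphabet = true it translates A-Z, otherwise only
 * the letters that occur in $word. Assumes an ASCII name without quotes.
 */
function buildNameQuery(string $word, bool $fullAlphabet = false): string
{
    // The comparison target must be lowercase, because translate() lowercases name().
    $word = \mb_strtolower($word);

    // count_chars() mode 3 yields each distinct byte once, e.g. 'rss' gives 'rs'.
    $lower = $fullAlphabet ? 'abcdefghijklmnopqrstuvwxyz' : \count_chars($word, 3);
    $upper = \mb_strtoupper($lower);

    return \sprintf("/*[translate(name(), '%s', '%s') = '%s']", $upper, $lower, $word);
}

$reduced = buildNameQuery('rss');
$full = buildNameQuery('rss', true);

// Count matches per sample document for both query styles.
$matches = [];
foreach (['<rss/>', '<RSS/>', '<Rss/>', '<feed/>'] as $xml) {
    $doc = new \DOMDocument();
    $doc->loadXML($xml);
    $xpath = new \DOMXPath($doc);

    $matches[$xml] = [
        'reduced' => $xpath->query($reduced)->length,
        'full' => $xpath->query($full)->length,
    ];
    \printf("%-8s reduced=%d full=%d\n", $xml, $matches[$xml]['reduced'], $matches[$xml]['full']);
}
```

Both query styles should select the same nodes; the only difference is how many characters translate() has to map, which is where the speed-up described above comes from.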
