I need to check, in PHP, whether a URL ends in /news_archive or /news_archive/5. The snippet below does exactly what I want, but I know I could achieve this with one preg_match rather than two. How can I improve this code to treat the /5 as an optional segment and capture it if it exists?
if (preg_match('~/[0-9A-Za-z_-]+_archive/[0-9]+$~', $_SERVER['HTTP_REFERER'], $matches)
    || preg_match('~/[0-9A-Za-z_-]+_archive$~', $_SERVER['HTTP_REFERER'], $matches)) {
    $page_info['parent_page']['page_label'] = ltrim($matches[0], '/');
}
1 Answer
Consider your first pattern:
~/[0-9A-Za-z_-]+_archive/[0-9]+$~
Let's break it down:
1. /                 a literal slash
2. [0-9A-Za-z_-]+    one or more of 0-9, A-Z, a-z, _ or -
3. _archive          the literal string _archive
4. /                 a literal slash again
5. [0-9]+            one or more digits
6. $                 the end of the string must follow the one or more digits
So basically you want to make #4 and #5 optional. To be more specific, you want either both 4 and 5, or neither 4 nor 5.
Consider this:
(a[b]+)?
This means that you have one a followed by one or more b, and that this grouped a/b entity is optional.
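To see the optional group in isolation, here is a minimal sketch. The leading x, the anchors, and the helper name toy_match are only there for illustration and are not part of your pattern:

```php
<?php
// Hypothetical helper: reports whether $subject matches and what the optional group captured.
function toy_match(string $subject): array
{
    // "x", then optionally: one "a" followed by one or more "b".
    $matched = preg_match('~^x(a[b]+)?$~', $subject, $m) === 1;
    // $m[1] is absent when the optional group did not take part in the match.
    return [$matched, $m[1] ?? null];
}

var_dump(toy_match('x'));     // matches, group absent
var_dump(toy_match('xabbb')); // matches, group holds "abbb"
var_dump(toy_match('xa'));    // no match: "a" needs at least one "b" after it
```

The group is all or nothing: either the whole a/b run is present, or none of it is.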
Letting a be #4 and b be digits like in #5, we're left with:
(/[0-9]+)?
Or:
~/[0-9A-Za-z_-]+_archive(/[0-9]+)?$~
This will capture the entire group though, slash included, like /5:
php -r "preg_match('~/[0-9A-Za-z_-]+_archive(/[0-9]+)?$~', '/news_archive/5', $m); var_dump($m);"
array(2) {
[0] =>
string(15) "/news_archive/5"
[1] =>
string(2) "/5"
}
You can just add another group to remedy that though:
~/[0-9A-Za-z_-]+_archive(/([0-9]+))?$~
Example:
php -r "preg_match('~/[0-9A-Za-z_-]+_archive(/([0-9]+))?$~', '/news_archive/44', $m); var_dump($m);"
array(3) {
[0] =>
string(16) "/news_archive/44"
[1] =>
string(3) "/44"
[2] =>
string(2) "44"
}
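Applied to the code in your question, that could look roughly like the sketch below. The helper name parse_archive_url and the 'label'/'page' keys are made up for the example, and the ?? '' fallback is my own addition, on the assumption that the Referer header may be missing:

```php
<?php
// Hypothetical helper wrapping the single-pattern check.
function parse_archive_url(string $url): ?array
{
    if (preg_match('~/[0-9A-Za-z_-]+_archive(/([0-9]+))?$~', $url, $m) !== 1) {
        return null; // URL does not end in ..._archive or ..._archive/<digits>
    }
    return [
        'label' => ltrim($m[0], '/'), // same value the original snippet stored
        'page'  => $m[2] ?? null,     // digits only, or null when the segment is absent
    ];
}

// The Referer header is optional, so fall back to an empty string.
$info = parse_archive_url($_SERVER['HTTP_REFERER'] ?? '');
if ($info !== null) {
    $page_info['parent_page']['page_label'] = $info['label'];
}
```

Note the ?? on $m[2]: when the optional segment is missing, preg_match leaves the trailing unmatched groups out of the array instead of filling them with empty strings.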
You could technically make the outside group a non-capturing group (like (?:/([0-9]+))?), but I don't think the added complication is worth not grabbing the / part too.
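If you do want to try the non-capturing version, a quick sketch would be something like this (the variable names are just for the example):

```php
<?php
// Same pattern, but the outer group is non-capturing: (?: ... )
$pattern = '~/[0-9A-Za-z_-]+_archive(?:/([0-9]+))?$~';

preg_match($pattern, '/news_archive/44', $m);
var_dump($m); // $m[1] is "44"; there is no "/44" entry

preg_match($pattern, '/news_archive', $m);
var_dump($m); // only $m[0]: the unmatched trailing group is left out
```

The digits then sit at index 1 instead of index 2, which is the only practical difference.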
(By the way, sorry if you're familiar with regex and you found this excessive. I tend to take a very verbose approach to any regex related question :).)
- This is a fantastic response, and certainly more than I expected. In a good way. I am fairly unfamiliar with regex itself, so the thorough analysis was a pleasant and refreshing lesson! Thank you. – davo0105, Oct 5, 2012 at 22:43
- @davo0105 Glad I could help! :) – Corbin, Oct 6, 2012 at 3:58