PHP: most efficient way to check if a variable contains only certain chars

Question 1

I have a small function that I regularly use to check whether a variable contains only letters (a-z, A-Z), digits (0-9), and the special chars '-' and '_'. Currently I'm using the following:

function is_clean($string){
 $pattern = "/([a-z]|[A-Z]|[0-9]|-|_)*/";
 preg_match($pattern, $string, $return);
 $pass = (($string == $return[0]) ? TRUE : FALSE);
 return $pass;
}

1 - Can anyone tell me if this is the best/most efficient way to do this and, if not, explain why?

2 - When I view this function in my IDE I get a warning that $return is uninitialized. Should I be initializing variables in advance in PHP, and if so, how and why?

Question 2

No need for all that. Just use one bracket group, negate it (by prepending a ^), and use the return value directly:

function is_clean ($string) {
 return ! preg_match("/[^a-z\d_-]/i", $string);
}

Here's a quote from the PHP docs:

Return Values
preg_match() returns 1 if the pattern matches given subject, 0 if it does not, or FALSE if an error occurred.

In the regex above, we're looking for any characters in the string that are not in the bracket group. If none are found, preg_match will return 0 (which when negated will result in true). If any of those characters are found, 1 will be returned and negated to false.
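One detail the negation hides: preg_match() can also return FALSE (for example, when the pattern fails to compile), and !FALSE is true, so an error would be reported as "clean". A minimal sketch of a stricter variant follows; the name is_clean_strict is hypothetical and not part of the answer above.

```php
<?php
// Hypothetical stricter variant of is_clean(): distinguishes the three
// possible return values of preg_match() instead of negating them.
function is_clean_strict(string $string): bool
{
    // Look for any character that is NOT a letter, digit, '_' or '-'.
    $result = preg_match('/[^a-z\d_-]/i', $string);

    if ($result === false) {
        // A regex error occurred: refuse rather than report "clean".
        return false;
    }

    // 0 means no disallowed character was found.
    return $result === 0;
}
```

Note that the empty string passes this check, because it contains no disallowed character.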

Question 3

Certainly cleaner looking code with this approach - thanks.

Question 4

Did some tests with a few different strings: preg_match('@[^a-z\d_-]@i', $string) is consistently slower than preg_match('@^[a-z\d_-]+$@i', $string).
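A rough way to reproduce such a comparison is sketched below. The helper time_pattern, the sample strings, and the iteration count are all made up for illustration, and the hyphen is placed last in each character class so it cannot be read as a range. Timings will vary with the inputs and the PHP version, so treat any result as indicative only.

```php
<?php
// Hypothetical helper: total seconds spent matching $pattern against
// every string in $subjects, repeated $iterations times.
function time_pattern(string $pattern, array $subjects, int $iterations): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        foreach ($subjects as $subject) {
            preg_match($pattern, $subject);
        }
    }
    return microtime(true) - $start;
}

// Made-up sample inputs; real results depend on the strings used.
$subjects = array('abc-DEF_123', 'has space', str_repeat('a', 200) . '!', '');

$negated  = time_pattern('@[^a-z\d_-]@i', $subjects, 10000);
$anchored = time_pattern('@^[a-z\d_-]+$@i', $subjects, 10000);

printf("negated:  %.4f s\nanchored: %.4f s\n", $negated, $anchored);
```

The two patterns are not quite interchangeable: the negated form treats the empty string as clean, while the anchored form with + rejects it, and $ also matches just before a trailing newline, so "abc\n" passes the anchored check but not the negated one.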

Question 5

Just another method, without regex.

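function is_clean ($string) {
 // Strip the two allowed symbols, then require letters and digits only.
 // Note: ctype_alnum('') is false, so '', '-' and '_' on their own are rejected.
 return ctype_alnum(str_replace(array('-', '_'), '', $string));
}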

~~Maybe I'll find the time later today to compare the performance, but I guess 'efficient' in your question referred to the code, not the execution time?~~ Letharion did the work for me :)

Question 6

I spent some time on this yesterday. ctype alone is fastest, followed by the short preg_match, then ctype with str_replace, and finally the original. In many cases ctype should be the best way to do this because, unlike the regexp, it will work with characters like é. However, I didn't post an answer, because I couldn't get it to behave properly when trying it out.

Question 7

Maybe we should combine both approaches: use the regex only if the plain ctype check fails. That might give the best performance on average, if we can assume that there are not too many strings with - and _.
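A minimal sketch of that combination could look like the following. The name is_clean_combined is hypothetical, and whether it is actually faster was not measured here. One caveat: ctype_alnum() depends on the current locale, so under a non-default locale the fast path may accept characters the regex would reject.

```php
<?php
// Hypothetical combined check: try the cheap ctype test first and fall
// back to the regex only when the string is not purely alphanumeric.
function is_clean_combined(string $string): bool
{
    // Fast path: letters and digits only (also false for the empty string).
    if (ctype_alnum($string)) {
        return true;
    }

    // Slow path: the string may contain '-' or '_', or something disallowed.
    return preg_match('/[^a-z\d_-]/i', $string) === 0;
}
```

With this ordering, strings without - or _ never reach the regex, and the empty string is still accepted by the fallback.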

Question 8

Thanks for the suggestion. If I can get it working, it's probably better than a regex as far as performance is concerned. Will have a play and come back.

Joseph Silber · Answer 1 (Question 2 above) · 2013-03-13 18:15:18Z

mheinzerling · Answer 2 (Question 5 above) · 2013-03-14 05:06:09Z · edited 2013-03-14 11:02

Letharion · Comment on Answer 2 (Question 6 above) · 2013-03-14 07:37:32Z
mheinzerling · Comment on Answer 2 (Question 7 above) · 2013-03-14 11:00:55Z
SwiftD · Comment on Answer 2 (Question 8 above) · 2013-03-15 13:34:26Z
