(PHP 4, PHP 5, PHP 7, PHP 8)
strnatcmp — String comparisons using a "natural order" algorithm
This function implements a comparison algorithm that orders alphanumeric strings in the way a human being would; this is described as "natural ordering". Note that this comparison is case sensitive.
string1
The first string.
string2
The second string.
Returns a value less than 0 if string1 is less than string2; a value greater than 0 if string1 is greater than string2, and 0 if they are equal.
No particular meaning can be reliably inferred from the value aside
from its sign.
Version | Description |
---|---|
8.2.0 | This function is no longer guaranteed to return strlen($string1) - strlen($string2) when string lengths are not equal, but may now return -1 or 1 instead. |
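Because only the sign of the return value is dependable, code that needs a stable -1, 0 or 1 can normalise the result itself. The following is a minimal sketch, assuming PHP 7 or later for the `<=>` operator; `natural_sign()` is a hypothetical helper name and not part of PHP.

```php
<?php
// Hypothetical helper: collapse strnatcmp()'s result to exactly -1, 0 or 1,
// so callers never depend on the magnitude of the returned value.
function natural_sign(string $a, string $b): int
{
    return strnatcmp($a, $b) <=> 0;
}

var_dump(natural_sign("img2.png", "img10.png")); // int(-1)
var_dump(natural_sign("img10.png", "img2.png")); // int(1)
var_dump(natural_sign("img2.png", "img2.png"));  // int(0)
```

Callbacks passed to usort() only need the sign, so this normalisation mainly matters when the result is stored or compared against literal values.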
An example of the difference between this algorithm and the regular computer string sorting algorithms (used in strcmp()) can be seen below:
Example #1 strcmp()
<?php
$arr1 = $arr2 = array("img12.png", "img10.png", "img2.png", "img1.png");
echo "Standard string comparison\n";
usort($arr1, "strcmp");
print_r($arr1);
echo "\nNatural order string comparison\n";
usort($arr2, "strnatcmp");
print_r($arr2);
?>
The above example will output:
Standard string comparison
Array
(
    [0] => img1.png
    [1] => img10.png
    [2] => img12.png
    [3] => img2.png
)

Natural order string comparison
Array
(
    [0] => img1.png
    [1] => img2.png
    [2] => img10.png
    [3] => img12.png
)
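The description notes that the comparison is case sensitive. A short sketch of what that means in practice, contrasting strnatcmp() with strnatcasecmp() on mixed-case file names (the file names are made up for illustration):

```php
<?php
$files = array("img12.png", "IMG2.png", "img1.png", "Img10.png");

$sensitive = $insensitive = $files;
usort($sensitive, "strnatcmp");       // uppercase letters sort before lowercase
usort($insensitive, "strnatcasecmp"); // letter case is ignored

echo implode(", ", $sensitive), "\n";
// IMG2.png, Img10.png, img1.png, img12.png
echo implode(", ", $insensitive), "\n";
// img1.png, IMG2.png, Img10.png, img12.png
```

With strnatcmp() the letter case decides the order before the numbers are ever reached, so strnatcasecmp() is usually the better fit when names differ only in capitalisation.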
It can also be combined with usort() to sort by a value nested inside an array, for example:
<?php
$array = array(
    "city" => "xyz",
    "names" => array(
        array(
            "name" => "Ana2",
            "id" => 1
        ),
        array(
            "name" => "Ana1",
            "id" => 2
        )
    )
);
usort($array["names"], function ($a, $b) {
    return strnatcmp($a['name'], $b['name']);
});
There seems to be a bug in the localization for strnatcmp and strnatcasecmp. I searched the reported bugs and found a few entries that were up to four years old, but the problem still exists when using Swedish characters.
These functions might work instead.
<?php
function _strnatcasecmp($left, $right) {
    return _strnatcmp(strtolower($left), strtolower($right));
}

function _strnatcmp($left, $right) {
    while ((strlen($left) > 0) && (strlen($right) > 0)) {
        // Split off the leading non-digit run of each string.
        if (preg_match('/^([^0-9]*)([0-9].*)$/Us', $left, $lMatch)) {
            $lTest = $lMatch[1];
            $left = $lMatch[2];
        } else {
            $lTest = $left;
            $left = '';
        }
        if (preg_match('/^([^0-9]*)([0-9].*)$/Us', $right, $rMatch)) {
            $rTest = $rMatch[1];
            $right = $rMatch[2];
        } else {
            $rTest = $right;
            $right = '';
        }
        // Compare the non-digit runs as plain strings.
        $test = strcmp($lTest, $rTest);
        if ($test != 0) {
            return $test;
        }
        // Split off the leading digit run of each string. The second group
        // is absent when the digits reach the end of the string.
        if (preg_match('/^([0-9]+)([^0-9].*)?$/Us', $left, $lMatch)) {
            $lTest = intval($lMatch[1]);
            $left = isset($lMatch[2]) ? $lMatch[2] : '';
        } else {
            $lTest = 0;
        }
        if (preg_match('/^([0-9]+)([^0-9].*)?$/Us', $right, $rMatch)) {
            $rTest = intval($rMatch[1]);
            $right = isset($rMatch[2]) ? $rMatch[2] : '';
        } else {
            $rTest = 0;
        }
        // Compare the digit runs as integers.
        $test = $lTest - $rTest;
        if ($test != 0) {
            return $test;
        }
    }
    return strcmp($left, $right);
}
?>
The code is not optimized. It was just made to solve my problem.
This function has some interesting behaviour on strings consisting of mixed numbers and letters.
One may expect that such a mixed string would be treated as alpha-numeric, but that is not true.
var_dump(strnatcmp('23','123')); → int(-1)
As expected, 23 < 123 (even though the first digit is higher, the overall number is smaller).
var_dump(strnatcmp('yz','xyz')); → int(1)
As expected, yz > xyz (string comparison, regardless of string length).
var_dump(strnatcmp('2x','12y')); → int(-1)
Remarkably, 2x < 12y (a numeric comparison is done).
var_dump(strnatcmp('20x','12y')); → int(1)
Remarkably, 20x > 12y (a numeric comparison is done).
It seems to be splitting what is being compared into runs of numbers and letters, and then comparing each run in isolation, until it has an ordering difference.
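The run-by-run behaviour described above can be modelled with a short sketch. This is a simplified illustration only and not PHP's actual implementation: `split_runs()` and `runs_compare()` are hypothetical names, PHP 7 or later is assumed, and the built-in function has extra handling (for example around leading zeros and whitespace) that is not reproduced here.

```php
<?php
// Hypothetical helper: split a string into alternating digit / non-digit runs,
// e.g. "20x" becomes ["20", "x"].
function split_runs(string $s): array
{
    preg_match_all('/\d+|\D+/', $s, $m);
    return $m[0];
}

// Simplified model of the observed behaviour, not the real algorithm.
function runs_compare(string $a, string $b): int
{
    $ra = split_runs($a);
    $rb = split_runs($b);
    $n = min(count($ra), count($rb));
    for ($i = 0; $i < $n; $i++) {
        $x = $ra[$i];
        $y = $rb[$i];
        if (ctype_digit($x) && ctype_digit($y)) {
            $cmp = (int) $x <=> (int) $y; // digit runs: compare as integers
        } else {
            $cmp = strcmp($x, $y) <=> 0;  // other runs: plain byte comparison
        }
        if ($cmp !== 0) {
            return $cmp;                  // the first differing run decides
        }
    }
    // Every shared run was equal: the string with fewer runs sorts first.
    return count($ra) <=> count($rb);
}

var_dump(runs_compare('23', '123'));  // int(-1)
var_dump(runs_compare('yz', 'xyz'));  // int(1)
var_dump(runs_compare('2x', '12y'));  // int(-1)
var_dump(runs_compare('20x', '12y')); // int(1)
```

For these four inputs the sketch gives the same signs as the strnatcmp() results quoted above; it makes no claim to match the built-in function on other inputs.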
Some more remarkable outcomes:
var_dump(strnatcmp("0.15m", "0.2m")); → int(1)
var_dump(strnatcmp("0.15m", "0.20m")); → int(-1)
It's not about localisation:
var_dump(strnatcmp("0,15m", "0,2m")); → int(1)
var_dump(strnatcmp("0,15m", "0,20m")); → int(-1)
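These results are consistent with the digits after the separator being compared as whole numbers (15 against 2, then 15 against 20) and not as decimal fractions. If the strings really are decimal measurements, one option is to parse the leading number and compare numerically instead. A minimal sketch, assuming a "." decimal separator and PHP 7 or later; `decimal_compare()` is a hypothetical helper, not part of PHP.

```php
<?php
// Hypothetical helper: compare strings such as "0.15m" by their leading number.
function decimal_compare(string $a, string $b): int
{
    // floatval() reads the leading numeric part and ignores the trailing "m".
    return floatval($a) <=> floatval($b);
}

var_dump(decimal_compare("0.15m", "0.2m"));  // int(-1)
var_dump(decimal_compare("0.15m", "0.20m")); // int(-1)
var_dump(decimal_compare("0.2m", "0.20m"));  // int(0)
```

Values written with a comma, such as "0,15m", would need the separator replaced with "." first, because floatval() stops reading at the comma.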