I'm currently calculating the unique permutations of an array of data. While the following code is working, it's not as efficient as I would like. Once I get over 6 or 8 items, it becomes very slow and I start running into memory issues.
Here is the code and an explanation
<?php
function permuteUnique($items, $count = false, $perms = [], &$return = []) {
if ($count && count($return) == $count) return $return;
if (empty($items)) {
$duplicate = false;
foreach ($return as $a) {
if ($a === $perms) {
$duplicate = true;
break;
}
}
if (!$duplicate) $return[] = $perms;
} else {
for ($i = count($items) - 1; $i >= 0; --$i) {
$newitems = $items;
$newperms = $perms;
list($tmp) = array_splice($newitems, $i, 1);
array_unshift($newperms, $tmp);
permuteUnique($newitems, $count, $newperms, $return);
}
return $return;
}
}
function factorial($n) {
$f = 1;
for ($i = 2; $i <= $n; $i++) $f *= $i;
return $f;
}
Given the input [1, 1, 2] I receive the following output as expected
array (size=3)
0 =>
array (size=3)
0 => int 1
1 => int 1
2 => int 2
1 =>
array (size=3)
0 => int 1
1 => int 2
2 => int 1
2 =>
array (size=3)
0 => int 2
1 => int 1
2 => int 1
The $count parameter is so I can pass the number of unique permutations I expect to the function and once it has found that many, it can stop calculating and return the data. This is calculated as the factorial of the total number of items divided by the product of the factorial of the count of all duplicates. I'm not sure I said that right so let me show you an example.
Given the set [1, 2, 2, 3, 4, 4, 4, 4] the count of unique permutations is calculated as
8! / (2!4!) = 840 because there are 8 total items, one of them is duplicated twice, and another is duplicated 4 times.
Now if I translate that to php code...
<?php
$set = [1, 2, 2, 3, 4, 4, 4, 4];
$divisor = 1;
foreach (array_count_values($set) as $v) {
$divisor *= factorial($v);
}
$count = factorial(count($set)) / $divisor;
$permutations = permuteUnique($set, $count);
it's pretty slow. If I throw a counter into the permuteUnique function, it runs over 100k times before it finds the 840 unique permutations.
I would like to find a way to reduce this and find the shortest possible path to the unique permutations. I appreciate any help or advice you can give.
4 Answers 4
So I spent some more time thinking about this and here is what I came up with.
<?php
function permuteUnique($items, $perms = [], &$return = []) {
if (empty($items)) {
$return[] = $perms;
} else {
sort($items);
$prev = false;
for ($i = count($items) - 1; $i >= 0; --$i) {
$newitems = $items;
$tmp = array_splice($newitems, $i, 1)[0];
if ($tmp != $prev) {
$prev = $tmp;
$newperms = $perms;
array_unshift($newperms, $tmp);
permuteUnique($newitems, $newperms, $return);
}
}
return $return;
}
}
$permutations = permuteUnique([1, 2, 2, 3, 4, 4, 4, 4]);
Previous stats
Uniques: 840
Calls to permuteUnique: 107,591
Duplicates found: 38737
Execution time (seconds): 4.898668050766
New stats
Uniques: 840
Calls to permuteUnique: 2647
Duplicates found: 0
Execution time (seconds): 0.0095300674438477
So all I really did was sort the data set, keep track of the previous item, and not calculate permutations if the current item matched the previous. I also no longer have to pre calculate the amount of uniques and iterate through the permutations to check for duplicates. That made a world of difference.
I just tried the "Generation in lexicographic order" way on the wiki, and it generates the same result for your "1,2,2,3,4,4,4,4" sample, so I guess it is correct. Here is the code:
function &permuteUnique($items) {
sort($items);
$size = count($items);
$return = [];
while (true) {
$return[] = $items;
$invAt = $size - 2;
for (;;$invAt--) {
if ($invAt < 0) {
break 2;
}
if ($items[$invAt] < $items[$invAt + 1]) {
break;
}
}
$swap1Num = $items[$invAt];
$inv2At = $size - 1;
while ($swap1Num >= $items[$inv2At]) {
$inv2At--;
}
$items[$invAt] = $items[$inv2At];
$items[$inv2At] = $swap1Num;
$reverse1 = $invAt + 1;
$reverse2 = $size - 1;
while ($reverse1 < $reverse2) {
$temp = $items[$reverse1];
$items[$reverse1] = $items[$reverse2];
$items[$reverse2] = $temp;
$reverse1++;
$reverse2--;
}
}
return $return;
}
Profiling the time for your example input: the above method: 2600,3000,3000,2400,2400,3000; your "Calls to permuteUnique: 2647" method: 453425.6,454425.4,454625.8. In your this example input, it is around 500 times faster :) If you are processing the result one by one (I guess you will), using this non-recursive method, you can process one generated and then generate the next (instead of generating all and store all before processing).
2 Comments
Try this modificated iterative version. It does not have the recursive overhead.
Found on: http://docstore.mik.ua/orelly/webprog/pcook/ch04_26.htm
ORIGINAL:
function pc_next_permutation($p, $size) {
// slide down the array looking for where we're smaller than the next guy
for ($i = $size - 1; $p[$i] >= $p[$i+1]; --$i) { }
// if this doesn't occur, we've finished our permutations
// the array is reversed: (1, 2, 3, 4) => (4, 3, 2, 1)
if ($i == -1) { return false; }
// slide down the array looking for a bigger number than what we found before
for ($j = $size; $p[$j] <= $p[$i]; --$j) { }
// swap them
$tmp = $p[$i]; $p[$i] = $p[$j]; $p[$j] = $tmp;
// now reverse the elements in between by swapping the ends
for (++$i, $j = $size; $i < $j; ++$i, --$j) {
$tmp = $p[$i]; $p[$i] = $p[$j]; $p[$j] = $tmp;
}
return $p;
}
$set = split(' ', 'she sells seashells'); // like array('she', 'sells', 'seashells')
$size = count($set) - 1;
$perm = range(0, $size);
$j = 0;
do {
foreach ($perm as $i) { $perms[$j][] = $set[$i]; }
} while ($perm = pc_next_permutation($perm, $size) and ++$j);
foreach ($perms as $p) {
print join(' ', $p) . "\n";
}
Here is one idea to modify it to distinct permutations, but I think there are faster solutions....
function pc_next_permutation($p, $size) {
for ($i = $size - 1; $p[$i] >= $p[$i+1]; --$i) { }
if ($i == -1) { return false; }
for ($j = $size; $p[$j] <= $p[$i]; --$j) { }
$tmp = $p[$i]; $p[$i] = $p[$j]; $p[$j] = $tmp;
for (++$i, $j = $size; $i < $j; ++$i, --$j) {
$tmp = $p[$i]; $p[$i] = $p[$j]; $p[$j] = $tmp;
}
return $p;
}
$uniqueMap=array();
$set = split(' ', '1 2 2 3 4 4 4 4');
$size = count($set) - 1;
$perm = range(0, $size);
$j=0;
do {
$uniqueSetString="";
foreach ($perm as $i)
$uniqueSetString .= "|".$set[$i];
if (!isset($uniqueMap[$uniqueSetString]))
{
foreach ($perm as $i)
$perms[$j][] = $set[$i];
$uniqueMap[$uniqueSetString]=1;
}
} while ($perm = pc_next_permutation($perm, $size) and ++$j);
foreach ($perms as $p) {
print join(' ', $p) . "\n";
}
2 Comments
What you need is the factoriadic, it allows you to generate the nth permutation without having to all the previous / following ones. I've coded it in PHP but I don't have it with me ATM, sorry.
EDIT: Here you go, it should get you started.
std::next_permutationfor C++ and find or implement something like this for PHP.