Remove ALL duplicate elements from an array

Question 1

I have this code to remove duplicates (all occurrences) from an array of associative arrays. Does PHP have built-in functions to do this? Or is there a way to improve the code?

I looked at array_unique, array_search, array_map, array_reduce...

$articles = [
    [
        "id" => 0,
        "title" => "lorem",
        "reference" => "A"
    ],
    [
        "id" => 1,
        "title" => "ipsum",
        "reference" => "B"
    ],
    [
        "id" => 2,
        "title" => "dolor",
        "reference" => "C"
    ],
    [
        "id" => 3,
        "title" => "sit",
        "reference" => "A"
    ]
];
$references = array_column($articles, "reference");
$duplicates = array_values(array_unique(array_diff_assoc($references, array_unique($references))));
foreach ($duplicates as $duplicate) {
    foreach ($references as $index => $reference) {
        if ($duplicate === $reference) {
            unset($articles[$index]);
        }
    }
}
/**
 * $articles = [
 *     [
 *         "id" => 1,
 *         "title" => "ipsum",
 *         "reference" => "B"
 *     ],
 *     [
 *         "id" => 2,
 *         "title" => "dolor",
 *         "reference" => "C"
 *     ]
 * ]
 */

Answer 1

This task can and should be done with a single loop, without populating helper variables before the loop and without inefficient in_array() calls. In PHP, looking up a key (for example with isset()) is a hash-table lookup, which is generally more efficient than searching through values.

Code:

$found = [];
foreach ($articles as $index => ['reference' => $ref]) {
    if (!isset($found[$ref])) {
        $found[$ref] = $index;
    } else {
        unset($articles[$index], $articles[$found[$ref]]);
    }
}
var_export($articles);

Output:

array (
  1 => 
  array (
    'id' => 1,
    'title' => 'ipsum',
    'reference' => 'B',
  ),
  2 => 
  array (
    'id' => 2,
    'title' => 'dolor',
    'reference' => 'C',
  ),
)

I am using array destructuring syntax (available since PHP 7.1) in my foreach() for brevity and because I don't need the other column values.
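For readers unfamiliar with the syntax, here is a minimal sketch of what the keyed destructuring in the foreach() head does, next to the equivalent long form. The $rows sample data and the variable names are made up for illustration.

```php
<?php
// Hypothetical sample data, not taken from the question.
$rows = [
    ['id' => 0, 'reference' => 'A'],
    ['id' => 1, 'reference' => 'B'],
];

// Keyed destructuring: only the 'reference' column is pulled into $ref.
$viaDestructuring = [];
foreach ($rows as $index => ['reference' => $ref]) {
    $viaDestructuring[$index] = $ref;
}

// Equivalent long form without destructuring.
$viaLongForm = [];
foreach ($rows as $index => $row) {
    $viaLongForm[$index] = $row['reference'];
}

var_export($viaDestructuring === $viaLongForm); // true
```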

Finally, it doesn't matter if there are triplicates (or more instances of a reference value); the script will handle these in the same fashion. unset() will not generate any notices, warnings, or errors if it is given a non-existent element (as a parameter). This is why it is safe to unconditionally unset the first found reference, potentially multiple times.
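The triplicate claim can be checked with a small sketch. The function name removeAllDuplicated() and the sample rows are hypothetical; the loop logic is the one from this answer, written without destructuring so the column name can be passed in.

```php
<?php
// Hypothetical helper wrapping the single loop from this answer.
function removeAllDuplicated(array $rows, string $column): array
{
    $found = [];
    foreach ($rows as $index => $row) {
        $value = $row[$column];
        if (!isset($found[$value])) {
            // Remember where this value was first seen.
            $found[$value] = $index;
        } else {
            // Remove this occurrence and the first one; unsetting an
            // element that is already gone is silently ignored.
            unset($rows[$index], $rows[$found[$value]]);
        }
    }
    return $rows;
}

// 'A' occurs three times, 'B' once.
$rows = [
    ['id' => 0, 'reference' => 'A'],
    ['id' => 1, 'reference' => 'B'],
    ['id' => 2, 'reference' => 'A'],
    ['id' => 3, 'reference' => 'A'],
];
$result = removeAllDuplicated($rows, 'reference');

// Only the row at index 1 (reference 'B') is left.
var_export(array_keys($result));
```

Because foreach iterates by value here, removing elements from $rows inside the loop does not disturb the iteration.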

Answer 2

"Is there a way to improve the code?"

Instead of having two foreach loops:

foreach ($duplicates as $duplicate) {
    foreach ($references as $index => $reference) {
        if ($duplicate === $reference) {
            unset($articles[$index]);
        }
    }
}

It can be simplified using in_array():

foreach ($references as $index => $reference) {
    if (in_array($reference, $duplicates, true)) {
        unset($articles[$index]);
    }
}

While it would still technically have the same complexity (i.e. two loops, since in_array() iterates internally), it would have one less indentation level and would use a built-in function to check whether the reference is in the list of duplicate references.
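If the scan inside in_array() is a concern, one possible variation (a sketch, not part of the suggestion above) is to flip $duplicates once so that the membership test becomes a key lookup with isset(). This assumes the reference values are strings or integers, since array_flip() only accepts those as values. The data mirrors the question; $isDuplicate is a made-up name.

```php
<?php
$articles = [
    ['id' => 0, 'title' => 'lorem', 'reference' => 'A'],
    ['id' => 1, 'title' => 'ipsum', 'reference' => 'B'],
    ['id' => 2, 'title' => 'dolor', 'reference' => 'C'],
    ['id' => 3, 'title' => 'sit',   'reference' => 'A'],
];
$references = array_column($articles, 'reference');

// Same duplicate detection as in the question.
$duplicates = array_unique(array_diff_assoc($references, array_unique($references)));

// Flip once: every duplicated reference becomes an array key.
$isDuplicate = array_flip($duplicates);

foreach ($references as $index => $reference) {
    if (isset($isDuplicate[$reference])) {
        unset($articles[$index]);
    }
}

// Indexes 1 and 2 (references 'B' and 'C') remain.
var_export(array_keys($articles));
```

One trade-off: unlike the strict in_array() check, array keys do not distinguish the integer 1 from the numeric string "1", so this variant only fits data where that difference does not matter.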

Another solution would be to use array_flip() to map each reference to the last index at which it occurs. Then loop through the articles: if the index of the current article does not match the last index for its reference (meaning it's a duplicate), remove both the article at the current index and the article at that last index.

$references = array_column($articles, "reference");
$lastIndexes = array_flip($references);
foreach ($articles as $index => $article) {
    if ($lastIndexes[$article['reference']] !== $index) {
        unset($articles[$index], $articles[$lastIndexes[$article['reference']]]);
    }
}

Or to make it more readable, the last index can be assigned to a variable:

foreach ($articles as $index => $article) {
    $lastIndex = $lastIndexes[$article['reference']];
    if ($lastIndex !== $index) {
        unset($articles[$index], $articles[$lastIndex]);
    }
}
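To see why the comparison against $lastIndexes works, it may help to look at what array_flip() returns for the question's references: when a value occurs more than once, the later index overwrites the earlier one. A small sketch, with the references written out by hand:

```php
<?php
// The 'reference' column of the question's data, keyed by article index.
$references = [0 => 'A', 1 => 'B', 2 => 'C', 3 => 'A'];

// Values become keys; for the repeated 'A', index 3 overwrites index 0.
$lastIndexes = array_flip($references);

var_export($lastIndexes['A']); // 3
```

So the article at index 0 sees a last index of 3, which differs from its own, and both entries are removed.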

Accepted answer (the single-loop solution) by mickmackusa, 2021-11-02 22:34:06Z
