I have an array of non-Equatable
objects. I want to remove all the duplicates from the array.
class SearchResult { // non-equatable
var index: Int!
var score: Int!
init(_ index: Int, score: Int = 0) {
self.index = index
self.score = score
}
}
var searchResults: [SearchResult] = [SearchResult(0), SearchResult(1), SearchResult(1), SearchResult(2), SearchResult(2)]
searchResults.map { 0ドル.index }
gives us: 0, 1, 1, 2, 2
Currently, I do so by mapping the objects' property index
, which is equatable, since it's an Int
. This way, I can remove the duplicates from the indices array:
let withDupes: [SearchResult] = searchResults
let indices = withDupes.map { 0ドル.index }.removingDuplicates()
For reference, .removingDuplicates()
comes from a post on StackOverflow
Note: this is the part I want to be reviewed.
Then, I get the object with the corresponding index by looping through the indices array and add it to the noDupes
array.
var noDupes: [SearchResult] = []
indices.forEach { (index) in
guard let notADupe = (withDupes.filter { 0ドル.index == index }).first else { return }
noDupes.append(notADupe)
}
noDupes.map { 0ドル.index }
now is: 0, 1, 2
Now, this code works, but it will be executed very often. Since I feel like this is not a very efficient way to remove the duplicates, I fear a performance drop.
How can I improve and / or simplify my code and still successfully remove the duplicates from the array?
2 Answers 2
First note that there is no need to make the SearchResult
properties (implicitly unwrapped) optionals, as they are always initialized:
class SearchResult { // non-equatable
var index: Int
var score: Int
init(_ index: Int, score: Int = 0) {
self.index = index
self.score = score
}
}
Your approach is indeed not optimal, after determining a unique set of property
values, the original array must be searched for each value separately. This lookup can be improved slightly by using first(where:)
var noDupes: [SearchResult] = []
indices.forEach { (index) in
guard let notADupe = withDupes.first(where: { 0ドル.index == index }) else { return }
noDupes.append(notADupe)
}
The advantage of first(where:)
over filter + first
is that it short-circuits and does not create an intermediate array.
But a better and faster solution would be to modify the removingDuplicates()
method so that it compares the array elements according to a given key (which is
then required to be Equatable
).
Here is a possible implementation, using the "Smart KeyPath" feature from Swift 4:
extension Array {
func removingDuplicates<T: Equatable>(byKey key: KeyPath<Element, T>) -> [Element] {
var result = [Element]()
var seen = [T]()
for value in self {
let key = value[keyPath: key]
if !seen.contains(key) {
seen.append(key)
result.append(value)
}
}
return result
}
}
This can then simply be used as
let searchResults: [SearchResult] = [SearchResult(0), SearchResult(1), SearchResult(1), SearchResult(2), SearchResult(2)]
let withoutDuplicates = searchResults.removingDuplicates(byKey: \.index)
It might also be worth to add another specialization for the case that the key is
Hashable
(as it is in your example) because then the seen
array can be replaced by a set,
which improves the lookup speed from O(N)
to O(1)
:
extension Array {
func removingDuplicates<T: Hashable>(byKey key: KeyPath<Element, T>) -> [Element] {
var result = [Element]()
var seen = Set<T>()
for value in self {
if seen.insert(value[keyPath: key]).inserted {
result.append(value)
}
}
return result
}
}
Note that if seen.insert(key).inserted
does the "test and insert if not present" with a single call.
Another (more flexible) option is to pass a closure to the function which determines the uniquing key for each element:
extension Array {
func removingDuplicates<T: Hashable>(byKey key: (Element) -> T) -> [Element] {
var result = [Element]()
var seen = Set<T>()
for value in self {
if seen.insert(key(value)).inserted {
result.append(value)
}
}
return result
}
}
Example:
let withoutDuplicates = searchResults.removingDuplicates(byKey: { 0ドル.index })
-
\$\begingroup\$ Hey, this looks awesome. Thanks so far! I'm getting an error on
.removingDuplicates(byKey: \.index)
sayingType of expression is ambiguous without more context
. Is the backslash intended or is that a typo? \$\endgroup\$LinusG.– LinusG.2018年03月19日 16:41:18 +00:00Commented Mar 19, 2018 at 16:41 -
1\$\begingroup\$ @LinusGeffarth: I don't know if the key paths work for implicitly unwrapped optionals. I have added another option (passing a closure) which is more flexible and that would work with an IUO as well. – The
removingDuplicates<T: Equatable>
should be faster because it traverses the array only once, whereas your solution traverses the array again for each key. \$\endgroup\$Martin R– Martin R2018年03月19日 21:50:08 +00:00Commented Mar 19, 2018 at 21:50 -
1\$\begingroup\$ Surely a filter with side effects is a lesser sin than reinventing the
filter
wheel \$\endgroup\$Alexander– Alexander2018年03月21日 19:20:44 +00:00Commented Mar 21, 2018 at 19:20 -
1\$\begingroup\$ @Alexander: Actually I should have known about that approach: stackoverflow.com/a/42542237/1187415 :) \$\endgroup\$Martin R– Martin R2018年03月22日 14:40:00 +00:00Commented Mar 22, 2018 at 14:40
-
1\$\begingroup\$ @MartinR We've all been there hah stackoverflow.com/a/40579948/3141234 \$\endgroup\$Alexander– Alexander2018年03月22日 15:53:35 +00:00Commented Mar 22, 2018 at 15:53
Here's a shorthand solution using reduce
so that you don't need the for-loop.
let items = itemsWithDuplicates.reduce([]) { array, item in !array.contains(where: { 0ドル.index == item.index }) ? array + [item] : array }
-
1\$\begingroup\$ Welcome to Code Review! You have presented an alternative solution, but haven't reviewed the code. Please edit to show what aspects of the question code prompted you to write this version, and in what ways it's an improvement over the original. It may be worth (re-)reading How to Answer. \$\endgroup\$Toby Speight– Toby Speight2022年02月11日 08:15:12 +00:00Commented Feb 11, 2022 at 8:15
-
\$\begingroup\$ Thanks, @TobySpeight. For a moment I thought this was stack overflow. I did a quick re-write of the explanation and will do a re-read of your terms soon. \$\endgroup\$Andres Canella– Andres Canella2022年02月12日 13:44:36 +00:00Commented Feb 12, 2022 at 13:44