3

I'm currently fetching a lot of objects that contains both names and coordinates of streets. The returned array has around 22.000 objects and the resulting array we want has around 4000, the rest are duplicates. The problem with this kind of data is that the fetched objects can have the same name but different coordinates, and i'm only interested by getting objects based on unique names. If there are more than one object with the same name, i'll only want to keep the first object.

So far i've been trying to loop through the streets by comparing the names. I'd rather use filter or some other more performance efficient solution.

My struct

struct StreetName {
 var name: String
 var polyLine: CLLocationCoordinate2D
}

My code so far

DataManager.shared.getStreetNames { (streets) in 
 var namesArray: [StreetName] = []
 for streetName in streets {
 let name = streetName.name
 if namesArray.count == 0 {
 namesArray.append(streetName)
 } else if namesArray.contains(where: {0ドル.name == name }) { 
 /* Dont add */ 
 } else {
 namesArray.append(streetName)
 }
 }
 self.streetNames = namesArray.sorted(by: {0ドル.name < 1ドル.name})
 self.filteredStreetNames = self.streetNames
 OperationQueue.main.addOperation {
 self.streetTableView.reloadData()
 }
}

This code block works, but runs in around 30 seconds on an iPhone X. Which is way too slow. Any ideas?

asked Apr 27, 2018 at 7:33
9
  • 1
    I suggest this question is reopened, the proposed duplicate addresses removing duplicates, but doesn't address the efficiency issue. Commented Apr 27, 2018 at 7:40
  • Why did you remove the sort portion of your code? Commented Apr 27, 2018 at 7:53
  • Reverted the edit Commented Apr 27, 2018 at 7:54
  • Perhaps, use a query that will return a sorted array of streets straight from your database. Removing the duplicates is then easy with a forward iterator or use a tailored reduce function. Commented Apr 27, 2018 at 7:57
  • 1
    This is what i meant: Clean way to filter an array of dictionaries by unique dictionary values Commented Apr 27, 2018 at 8:56

3 Answers 3

2

I think if you profile this, you'll find that the sort is taking the most time. I can't find an official note, but there's a good chance that the underlying implementation is quick sort, which has it's worst complexity when the array is already sorted (or the array is sorted in reverse order).

The average case complexity for quick sort is O(n log n), but in the worst case it's O(n2).

I think you should implement insertion sort instead, or, more accurately, always insert the new elements into an already sorted position. This should reduce your complexity to O(n) for the entire function.

Pseudocode:

  • Fetch street names
  • For each street name
    • find the position in the existing array where the street name would go (I suggest binary search since the array is already sorted)
    • if the street name already exists, skip
    • if the name doesn't exist, insert it.

The result should be a sorted array of unique street names requiring each name to only be read and inserted once.

answered Apr 27, 2018 at 7:49

6 Comments

If the array is already "somewhat" sorted which is not optimal for quick sort, how about shuffle then quick sort then reduce?
I also like the idea of this, but have a hard time trying to figure out how to make this pseudo code into code. Any pointers?
In the suggested duplicate the answer of mxcl provides the most efficient way to remove duplicates. By comparison the time to sort the array is negligible.
Swift uses introsort (a variant of quick sort), compare stackoverflow.com/q/27677026/1187415. Whether an already sorted array is the worst case or not depends on the choice of the pivot element. Apparently Swift uses the middle value from the first, middle, last element.
@vadian - Doesn't this compare equal objects? The objects in this case is unique, but some values might be same...
|
1

My take on this:

// Given an array of elements (here just Ints):
let array = (0..<1000).map { _ in Int(arc4random_uniform(100)) }
// Sort it:
let sorted = array.sorted()
// Define an empty result (array of elements) which is a variable 
// and which gets modified in the subsequent reduce function:
var unique: [Int] = []
// A tailored reduce which depends on a sorted array and appends 
// to the result IFF that element is not the last in result:
let result = sorted.reduce(into: unique) { (result, element) in
 if let last = result.last, last == element {
 } else {
 result.append(element)
 }
}

Finally, print the result:

print(array)

Example output on the console: console [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]

answered Apr 27, 2018 at 8:43

Comments

1

@MartinR solved this by using Sets.

My new updated struct

struct StreetName: Hashable {
 static func == (lhs: StreetName, rhs: StreetName) -> Bool {
 return lhs.name == rhs.name
 }
 var hashValue: Int {
 return name.hashValue
 }
 var name: String
 var polyLine: CLLocationCoordinate2D
}

My new updated code

DataManager.shared.getStreetNames { (returnedNamesSet) in
 var namesArray: [StreetName] = Array(returnedNamesSet)
 self.streetNames = namesArray.sorted(by: {0ドル.name < 1ドル.name})
 self.filteredStreetNames = self.streetNames
 OperationQueue.main.addOperation {
 self.streetTableView.reloadData()
 }
}


Results:

The process time went from 30 seconds to 0.4 seconds by using Set

answered Apr 27, 2018 at 9:36

6 Comments

Your implementation of == is wrong: Different strings can have identical hash values.
Do you have any suggestions to solve this? So far i'm having no problem with this, but you're correct about that @MartinR
If equality is based only on the street names then just return lhs.name == rhs.name
Btw, making type itself hashable is not exactly what I suggested in stackoverflow.com/a/42542237/1187415, but to keep all names seen so far in a Set<String>.
Thanks it's working perfectly, I'll edit the answer! Really, thanks @MartinR
|

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.