I need to identify/select/print the duplicate values in a field depending on the values of another one.

Example:

Field1 Field2
100 1
100 2
200 1
200 2
200 3
200 2
300 1
300 2
300 3
300 4

Basically, each value in "Field1" should be paired with a given "Field2" value only once: 100.1, 100.2, etc. are fine, but two occurrences of 200.2 are not. I'm using these PyQGIS lines to find duplicates within a single field:

import collections
layer = iface.activeLayer()
list_of_values = QgsVectorLayerUtils.getValues(layer, 'Field1')[0]
list_of_values.sort()
print([item for item, count in collections.Counter(list_of_values).items() if count > 1])

but I can't modify them in the proper way to achieve these results.

asked Jun 18, 2022 at 15:54
  • What exactly do you want your result to be? Do you want to know for which values duplicates exist? Do you want to know the feature IDs of all duplicates? Do you want the feature IDs of all but the first duplicate? Or something completely different? Commented Jun 18, 2022 at 16:21
  • @bugmenot123, The main aim is to identify the features whose combination of Field1 and Field2 is duplicated. For every value in Field1 you can have n values in Field2, and each combination must be unique. The Field2 values do not have to be sequential (100.1, 100.2, 100.23, 100.24 is fine as well), but each pair of values must be unique (only one 100.7 is allowed). Commented Jun 19, 2022 at 6:16

2 Answers

I like collections.defaultdict(list):

from collections import defaultdict
layer = QgsProject.instance().mapLayersByName('ok_ak_riks')[0]
fieldnames = ['kom_kod', 'lan_kod']
d = defaultdict(list)
for f in layer.getFeatures():
    # For each combination of field 1 and field 2, collect the feature ids in a list
    d[(str(f[fieldnames[0]]), str(f[fieldnames[1]]))].append(f.id())

#d[('0380', '03')]
#[288, 289]
#So we have two features, 288 and 289, with the value 0380 in field 1 and 03 in field 2
to_select = []
for key, idlist in d.items():
    if len(idlist) > 1:  # More than one feature has this field 1 / field 2 combination
        to_select.extend(idlist)

#to_select
#[11, 12, 33, 34, 35, 44, 45, 49, 50, 54, 55, 62, 63, 64, 65, 80, 81, 82, 83, 93, 94, 95, 100, 101, 108, 109, 114, 115, 116, 117, 118, 119, 120, 122, 123, 130, 131, 159, 160, 163, 164, 200, 201, 203, 204, 208, 209, 214, 215, 224, 225, 230, 231, 253, 254, 279, 280, 288, 289, 290, 291, 292, 323, 324, 325, 326]
layer.select(to_select)
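
To see the grouping idea without a QGIS layer, here is a minimal plain-Python sketch run on the sample table from the question. The feature ids 0 to 9 and the function name `duplicate_ids` are made up for illustration; in QGIS the ids would come from `f.id()`.

```python
from collections import defaultdict

# Sample rows from the question as (feature id, Field1, Field2).
# The feature ids 0..9 are hypothetical, added only for this illustration.
rows = [
    (0, 100, 1), (1, 100, 2),
    (2, 200, 1), (3, 200, 2), (4, 200, 3), (5, 200, 2),
    (6, 300, 1), (7, 300, 2), (8, 300, 3), (9, 300, 4),
]

def duplicate_ids(rows):
    """Return the ids of all rows whose (Field1, Field2) pair occurs more than once."""
    groups = defaultdict(list)
    for fid, field1, field2 in rows:
        # Group feature ids by their (Field1, Field2) combination
        groups[(field1, field2)].append(fid)
    dupes = []
    for ids in groups.values():
        if len(ids) > 1:  # this combination is shared by several rows
            dupes.extend(ids)
    return dupes

print(duplicate_ids(rows))  # [3, 5]
```

Only the two rows holding 200.2 are reported, which matches the duplicate described in the question.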

answered Jun 18, 2022 at 16:27
Try this:

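layer = iface.activeLayer()
field_a = 'Field1'  # name of Field1, inside quotes
field_b = 'Field2'  # name of Field2, inside quotes
# Keep the real feature id next to each (Field1, Field2) pair;
# an enumerate() index only matches the feature id when ids start at 0 with no gaps
feat_list = [(feature.id(), (feature.attribute(field_a), feature.attribute(field_b))) for feature in layer.getFeatures()]
pairs = [pair for _, pair in feat_list]
selection = []
for fid, pair in feat_list:
    if pairs.count(pair) > 1:  # count() rescans the whole list, so this is slow on large layers
        selection.append(fid)
layer.select(selection)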
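
Calling `count()` inside the loop rescans the whole list for every feature, which is probably why this approach gets slow. One possible faster variant counts every pair once with `collections.Counter` and then filters. This is only a sketch, not necessarily the updated script mentioned in the comments; the function name `ids_with_repeated_pairs` and the sample data are made up for illustration.

```python
from collections import Counter

def ids_with_repeated_pairs(records):
    """records: iterable of (feature_id, (value_a, value_b)) tuples."""
    records = list(records)
    # First pass: count how often each pair of values occurs
    counts = Counter(pair for _, pair in records)
    # Second pass: keep the ids whose pair occurs more than once
    return [fid for fid, pair in records if counts[pair] > 1]

# Hypothetical sample: ids 3 and 4 share the pair (200, 2)
sample = [(1, (100, 1)), (2, (100, 2)), (3, (200, 2)), (4, (200, 2)), (5, (300, 1))]
print(ids_with_repeated_pairs(sample))  # [3, 4]
```

In the QGIS console the input could be built with something like `[(f.id(), (f[field_a], f[field_b])) for f in layer.getFeatures()]`, and the returned list passed to `layer.select()`.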
answered Jun 18, 2022 at 23:52
  • It works, thanks. But I have to say that it's much slower than @BERA's solution, even with a relatively small test dataset. Commented Jun 19, 2022 at 17:29
  • @HyPhens you are right, it is very inefficient. I updated my answer with a script that performs better. Commented Jun 19, 2022 at 18:19
  • I've upvoted both @BERA's and Mayo's solutions as they work well for my needs for now. Commented Jul 3, 2022 at 19:17
