System: Windows 10
QGIS version 3.28 'Firenze'
I have a line shapefile with 188,722 features. For each feature, I would like to store a unique id in the attribute table, in a field named 'main_id'.
However, at the moment I am iterating over all features, which causes QGIS to freeze.
My code so far is:
# import relevant libraries
import os
from qgis.core import *
from qgis.PyQt.QtCore import QVariant
# get the path to the shapefile e.g. /home/project/data/ports.shp
UBA_network_proj = "full/path/to/shapefile.shp"
# create QGIS vector layer object
vlayer = QgsVectorLayer(UBA_network_proj, "main_roads_layer", "ogr")
# create and add a new empty field to the layer attribute table
layer_provider = vlayer.dataProvider()
layer_provider.addAttributes([QgsField("main_id", QVariant.Int)])
vlayer.updateFields()
# start editing mode, iterate features, get their ids,
# store them in the newly created field 'main_id', and update the layer's attribute table
vlayer.startEditing()
features = vlayer.getFeatures()
for f in features:
    id = f.id()
    value = id
    attr_value = {12: value}  # field index -> new value
    layer_provider.changeAttributeValues({id: attr_value})
vlayer.commitChanges()
However, when I iterate over the features, QGIS freezes for a long time. I have not even let it run to the end, because I know that iterating over almost 190,000 features this way cannot be the most performant approach.
Is there a more performant way to do this with PyQGIS?
For example, with the processing module.
-
To prevent QGIS from freezing when using the console, you can "pass your code" to the task manager. There are different ways to do this: opengis.ch/2018/06/22/threads-in-pyqgis3 docs.qgis.org/3.28/en/docs/pyqgis_developer_cookbook/tasks.html youtube.com/@awopbob4725/videos gitlab.com/awopbob_qgis/taskManager/-/blob/main/… – thipa, Apr 18, 2023 at 11:31
-
Thank you, I will try it out! – i.i.k., Apr 18, 2023 at 12:16
1 Answer
There are some redundancies in your code, and you are mixing layer editing methods with provider methods, which is not recommended. At the end of the day, I don't think you can avoid feature iteration at some point, but you definitely don't need to call changeAttributeValues() on every iteration.
Try the simplified snippet below. An attributes map can contain as many feature id keys as there are features; each value is a second dictionary with a field index as key and the new attribute as value. We can take advantage of this structure by using a dictionary comprehension to build a single attributes map, then making a single changeAttributeValues() call at the end of the script, passing in the attributes map. On a test layer (shapefile) with 180,500 features, this did the job in around 2-3 seconds.
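from qgis.core import QgsVectorLayer, QgsField
from qgis.PyQt.QtCore import QVariant

lyr_path = '/home/ben/test/SHP/test_layer.shp'
vlayer = QgsVectorLayer(lyr_path, 'Main_roads_layer', 'ogr')
# add the new field through the provider, then refresh the layer's field list
vlayer.dataProvider().addAttributes([QgsField('main_id', QVariant.Int)])
vlayer.updateFields()
fld_idx = vlayer.fields().lookupField('main_id')
# {feature id: {field index: new value}} for every feature
atts_map = {ft.id(): {fld_idx: ft.id()} for ft in vlayer.getFeatures()}
# a single provider call writes all values at once
vlayer.dataProvider().changeAttributeValues(atts_map)
print('Done')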
A few details on the testing, because there may be other variables to consider: QGIS 3.28, Ubuntu 22.04, desktop machine with 32 GB RAM and a Ryzen 5 5600G with integrated graphics (no GPU). The shapefile was stored locally. I have noticed at work that saving layer edits is much slower for layers stored on a network drive than for layers stored on the local machine.
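To make the shape of the attributes map easier to see, here is a minimal pure-Python sketch that runs without QGIS. FakeProvider is a hypothetical stand-in, not part of PyQGIS: it only counts how often changeAttributeValues() is called and stores what it receives. The five feature ids and the field index 12 are made-up values for illustration.

```python
# Pure-Python sketch; FakeProvider is a hypothetical stand-in for a data provider.
class FakeProvider:
    def __init__(self):
        self.calls = 0   # number of changeAttributeValues() calls
        self.table = {}  # feature id -> {field index: value}

    def changeAttributeValues(self, attr_map):
        self.calls += 1
        for fid, changes in attr_map.items():
            self.table.setdefault(fid, {}).update(changes)
        return True


feature_ids = range(5)  # assumed ids, standing in for ft.id()
fld_idx = 12            # assumed index, standing in for lookupField('main_id')

# Pattern from the question: one provider call per feature.
per_feature = FakeProvider()
for fid in feature_ids:
    per_feature.changeAttributeValues({fid: {fld_idx: fid}})

# Pattern from the answer: build the whole map first, then call once.
atts_map = {fid: {fld_idx: fid} for fid in feature_ids}
batched = FakeProvider()
batched.changeAttributeValues(atts_map)

print(per_feature.calls, batched.calls)
print(atts_map)
```

Both patterns still visit every feature once and end up with the same stored values. The difference is only in how many times the provider is asked to write: once per feature versus once in total.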
-
Thank you so much! If I understand correctly, you use a dict comprehension instead of iteration? Can you elaborate a little on the code? – i.i.k., Apr 18, 2023 at 11:36
-
@i.i.k. No problem! I'm guessing it worked a lot quicker? As commented by ThiPa, you can use QgsTask to push a long-running task to a background thread and keep the main UI responsive, but I don't think it should really be necessary here. By the way, if it solved your problem, you could consider accepting the answer ;-) – Ben W, Apr 18, 2023 at 11:42
-
@i.i.k. Correct: a dict comprehension to build a single attribute map, and a single call to changeAttributeValues(). – Ben W, Apr 18, 2023 at 11:46
-
@i.i.k. I added a bit more info to the answer. – Ben W, Apr 18, 2023 at 11:54
-
@Ben W Just tried it out and accepted! :) It works much faster, as you said, ~2-3 sec. Perfect! And I learned about dict comprehension, which I hadn't used so far! – i.i.k., Apr 18, 2023 at 11:56