Using QGIS version 2.18.11
I have an attribute table with a column of over 500K records that contains many misspellings, abbreviations and erroneous characters, which I need to clean up on a regular basis. I can use regexp_replace in the field calculator, but I can only run one expression at a time, and I need to run approximately 150 expressions.
For example:
regexp_replace("DirtyNames" ,'([-\/~!@#$%^&*{}:"<>?|/.,;=_+]?)', '')
and
regexp_replace("DirtyNames" ,'TRL', 'TRAIL')
I think a Python script would likely be the easiest way to do this, but I don't know Python. The table is "Local_Streets", the field is "DirtyNames", and the regexp results should go to the "CleanedNames" field.
# Run_Regexp_Statements
####################################
from qgis.utils import iface
from qgis.core import QgsField
from PyQt4.QtCore import QVariant
_DIRTY_FIELD = 'DirtyNames'
_CLEAN_FIELD = 'CleanedNames'
layer = iface.activeLayer()
layer.startEditing()
# Create a field to store the results
layer.dataProvider().addAttributes(
[QgsField(_CLEAN_FIELD, QVariant.String)])
layer.updateFields()
# **WHAT NEEDS TO GO HERE to be able to run the regexp’s ?**
# regexp_replace("DirtyNames" ,'([-\/~!@#$%^&*{}:"<>?|/.,;=_+]?)', '')
# regexp_replace("DirtyNames" ,'TRL', 'TRAIL')
layer.commitChanges()
print 'Processing complete.'
1 Answer
You'll need the re library.
import re
replaced = re.sub(r'([-\/~!@#$%^&*{}:"<>?|/.,;=_+]?)', r'replacementpattern', dirty_line)
More info.
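Since roughly 150 expressions have to run against each value, one option is to keep them in a list of (pattern, replacement) pairs and apply them in order. The following is only a sketch: `RULES`, `COMPILED_RULES` and `clean_name` are names made up for illustration, the trailing `?` of the original character class is dropped because matching an empty string is not needed, and the `\b` word boundaries around `TRL` are my own assumption to avoid touching longer words.

```python
import re

# Hypothetical rule list: (pattern, replacement) pairs, applied top to bottom.
RULES = [
    (r'[-\/~!@#$%^&*{}:"<>?|/.,;=_+]', ''),  # strip stray punctuation
    (r'\bTRL\b', 'TRAIL'),                    # expand an abbreviation
]

# Compile the patterns once up front instead of on every row.
COMPILED_RULES = [(re.compile(pattern), repl) for pattern, repl in RULES]


def clean_name(text, rules=COMPILED_RULES):
    """Apply every (compiled pattern, replacement) pair to text, in order."""
    for pattern, repl in rules:
        text = pattern.sub(repl, text)
    return text
```

Order matters here: the punctuation is stripped first, so a value such as 'OLD MILL TRL.' is already 'OLD MILL TRL' by the time the abbreviation rule runs.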
-
Can the following line be:
replaced = re.sub('TRL','TRAIL',replaced)
(and so on) in your code?
– gisnside, Sep 4, 2017 at 19:48
-
Thanks Ena2345 and gisnside for your replies. I get an invalid syntax error on both. I looked at the link provided by Ena, but I'm not really following what I'm doing wrong.
import re
replaced = re.sub('TRL','TRAIL',r'replacementpattern',_DIRTY_FIELD)
– Phil_in_Tx, Sep 4, 2017 at 20:46
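For reference, `re.sub` takes the pattern, the replacement and the string to search, in that order; a fourth positional argument is read as `count`, not as another string. It also works on a text value, not on a field name. A small sketch, where `dirty_value` is a made-up sample value standing in for one row of "DirtyNames":

```python
import re

dirty_value = 'OLD MILL TRL'  # hypothetical sample value read from one feature

# re.sub(pattern, replacement, string) returns a new string.
replaced = re.sub('TRL', 'TRAIL', dirty_value)

# Passing the field *name* only searches the literal text 'DirtyNames',
# which contains no 'TRL', so it comes back unchanged.
on_field_name = re.sub('TRL', 'TRAIL', 'DirtyNames')
```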
-
The reason it doesn't work is that re.sub only operates on one value at a time. If you loop through the layer's features and apply it to each value, you can do it. Such an operation costs O(n) in the number of features. I don't know how long it takes in QGIS, so I can't estimate the running time.
– Piskr, Sep 5, 2017 at 21:11
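A rough sketch of such a loop, assuming the layer behaves like a QGIS 2.18 QgsVectorLayer that already has both fields. `clean_layer` and its parameters are hypothetical names, and the way NULLs are skipped (anything that is not text is ignored) is an assumption rather than tested behaviour on a real layer:

```python
import re


def clean_layer(layer, rules, dirty_field='DirtyNames',
                clean_field='CleanedNames'):
    """Write the cleaned form of dirty_field into clean_field, row by row.

    `rules` is a list of (pattern, replacement) pairs applied in order.
    Returns the number of features that were updated.
    """
    compiled = [(re.compile(pattern), repl) for pattern, repl in rules]
    clean_idx = layer.fieldNameIndex(clean_field)
    if clean_idx < 0:
        raise ValueError('field not found: %s' % clean_field)
    text_type = type(u'')  # unicode on Python 2 (QGIS 2.x), str on Python 3

    layer.startEditing()
    changed = 0
    for feature in layer.getFeatures():  # one pass over the table: O(n)
        value = feature[dirty_field]
        if not isinstance(value, text_type):
            continue  # assumption: NULLs and non-text values are left alone
        for pattern, repl in compiled:
            value = pattern.sub(repl, value)
        layer.changeAttributeValue(feature.id(), clean_idx, value)
        changed += 1
    layer.commitChanges()
    return changed
```

From the QGIS Python console this could be called as `clean_layer(iface.activeLayer(), [('TRL', 'TRAIL')])`, with the rule list extended to cover all of the expressions.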